ABSTRACT


Modern Counter-Insurgency (COIN) and Irregular Warfare (IW) are increasingly
complex.  Contributing  to  this  complexity  is  the  need  to  develop  and  maintain  a  mental 
map  of  relevant  environmental  and  historical  factors  and  their  interactions,  generated 
from  disparate  sources  of  information  that  must  be  organized,  processed  and 
integrated.  Compounding  this  challenge  is  the  fact  that  mental  pictures  cannot  easily 
be  passed  from  one  soldier  to  the  next.  This  is  a  problem  when  the  tactical  situation 
dictates  frequent  changes  in  unit  Areas  of  Operations  (AOs),  and  particularly  in  cases 
where  units  rotate  on  a  regular  basis.  When  units  hand  over  an  AO,  the  incoming  unit 
must  quickly  rebuild  a  mental  picture  and  narrative  of  its  operating  environment. 
Because  of  this,  historical  organizational  knowledge  is  lost  that  could  otherwise 
increase  combat  effectiveness  and  reduce  casualties. 

This  thesis  discusses  a  prototype  architecture  for  a  system  that  will  enable  a 
vehicle  crew  commander  to  spatially  input,  organize  and  view  fused  tactical 
information through placement of 3D interactive symbols directly into the real-life on-site
scene from the vehicle perspective. A panoramic camera, dashboard monitor
and  head  tracker  give  the  commander  a  complete  view  of  the  vehicle  surroundings  for 
improved  situational  awareness,  and  a  360-degree  LiDAR  scanner  supplies  depth 
information  for  accurate  annotation  geo-location.  This  system  is  intended  to  generate 
greater  situational  understanding  of  the  complex  environment  present  in  COIN 
operations,  in  order  to  allow  greater  performance  and  survivability  of  the  vehicle  crew. 
Such  a  system,  if  fielded,  can  create  the  ability  to  add  numerous  other  capabilities  to 
the  combat  vehicle  crew. 
I.  INTRODUCTION 


Since  the  fall  of  the  Soviet  Union  in  the  early  1990s,  the  United  States  military 
has  found  itself  involved  in  conflicts  that  primarily  fall  on  a  lower  position  on  the 
operational  spectrum  than  conventional  high-intensity  combat.  Names  for  these  types 
of  conflict  change,  but  associated  terms  include  Low  Intensity  Conflict  (LIC), 
Counterinsurgency  (COIN),  and  Military  Operations  Other  Than  War  (MOOTW). 
Success  in  this  type  of  modern  combat  is  increasingly  dependent  on  the  flow  of 
information.  Compounding  the  difficulty  of  this  situation  are  the  circumstances  found 
in  a  low-intensity  combat  situation,  such  as  the  counterinsurgency  (COIN)  we 
currently  conduct  in  Afghanistan  and  Iraq.  In  this  environment,  the  necessity  to  have 
situational  understanding  involving  the  civilian  populace  greatly  increases  the  difficulty 
of  operations,  because  social-cultural  knowledge  is  difficult  to  describe  and 
communicate.  For  example,  it  is  useful  to  know  if  the  house  a  user  is  looking  at  has 
been  searched  by  previous  units,  and  what  was  found  during  the  search. 

This  thesis  describes  the  design  of  a  system  incorporating  Augmented  Reality 
(AR)  to  make  tactically-relevant  information  available  to  combat  and  patrol  vehicle 
commanders  in  an  operational  setting.  The  focus  of  this  research  and  prototype 
system  development  is  to  integrate  spatially  related  data  into  an  indirect  view  of  the 
outside  environment.  Street  names,  building  information,  blue  force  platforms  and 
intelligence  data  are  fused  with  the  video  from  vehicle-mounted  cameras. 




Figure  1  Unmodified  view  of  urban  Baghdad 

Terrain-associated  knowledge  persists  in  the  environment,  rather  than  being 
verbally  relayed,  stored  in  text  documents  or  on  paper  maps,  or  being  lost  entirely. 
Crucial  information — unobtrusively  displayed  at  the  right  moment  and  place — allows  a 
vehicle  crew  to  better  understand  their  operational  environment,  to  be  aware  of 
threats  that  may  be  present,  and  ultimately  to  improve  situational  awareness  and  crew 
safety.  Generally,  we  wish  to  transform  the  view  in  Figure  1  into  the  view  in  Figure  2, 
and  display 

The  following  chapter  explains  the  operational  problems  we  are  trying  to 
address,  as  well  as  basic  concepts  of  AR.  Chapter  III  is  a  literature  review,  in  which 
we  present  currently  deployed  systems  and  their  capabilities  and  limitations.  Chapter 
IV  is  the  analysis  of  the  system  requirements  and  prototype  design.  Chapter  V 
describes  our  plans  for  future  work.  The  last  chapter  summarizes  our  thesis  and 
presents  our  conclusions. 


Figure  2  Conceptual  view  through  goal  system 
II.  BACKGROUND 


A.  OPERATIONAL  PROBLEMS 

1.  Persistence  of  Knowledge  and  Understanding 

In  our  current  operational  theaters,  responsibility  for  a  particular  Area  of 
Operations  (AO)  changes  frequently,  due  either  to  scheduled  deployment 
rotations  or  to  unit  moves  within  theater  stemming  from  changes  in  operational 
requirements.  This  flux  tends  to  create  gaps  in  area  knowledge  for  the 
responsible  unit.  Outgoing  units  have  a  good  working  knowledge  of  the  area, 
providing  the  context  within  which  to  operate.  Incoming  units  lack  this  knowledge 
and  context.  Units  fresh  to  an  AO  interpret  their  surroundings  differently  than 
units  that  are  veteran  to  the  area.  While  the  veteran  unit  is  able  to  interpret 
environmental  cues  in  a  manner  moderated  by  its  experience,  the  new  unit  is 
lacking  such  nuanced  information. 

The  current  method  of  information  exchange  between  rotating  units 
generally involves two activities, which we will refer to as “ride-alongs” and “data
dumps.” Ride-alongs involve the new unit leadership participating as observers
as the outgoing unit conducts operations, thereby gaining exposure to the AO,
and some verbal transfer of historical and situational knowledge. The “data
dump” refers to the outgoing unit providing a massive amount of digital historical
data  in  the  form  of  slide  shows,  documents  and  images,  saved  on  either  hard 
disk  drives  or  removable  media  such  as  CD-ROMs.  This  is  usually  an 
unsatisfactory  method  of  information  conveyance:  the  mere  fact  that  the  data  is 
now  in  control  of  the  incoming  unit  is  very  different  from  that  unit’s  understanding 
of  the  data  and  even  more  so  from  its  being  able  to  utilize  the  data.  Furthermore, 
there  also  is  a  need  for  more  accurate  and  precise  tactical  data  collection  in 
COIN  operations,  both  for  trend  analysis  and  prediction  as  well  as  feedback  on 
performance  for  operating  small  units. 
The  precision  and  accuracy  of  spatio-temporal  data  about  events  on  the 
battlefield  often  are  hampered  by  the  necessity  to  rely  on  memories  of  individuals 
who  witnessed  the  event.  Anecdotal  recollections  tend  to  be  inaccurate  or  falsely 
precise,  and  this  limitation  perpetuates  throughout  the  information  sharing 
structure,  resulting  in  incorrect  target  location  and  inaccurate  data  collection. 
Since data analysis tends to be vulnerable to a “garbage-in, garbage-out”
phenomenon,  improving  the  means  of  collection  for  more  accurate  and  more 
precise  data  should  have  far-ranging  implications. 

In  fact,  very  little  information  currently  is  collected  in  operational  settings, 
and  units  do  not  have  tools  to  review  properties,  timing  and  location  of  events. 
This  is  in  contrast  to  training  settings,  where  Observer/Controllers  are  viewing  the 
unit’s  performance,  and  various  automated  instruments  are  available  for  tracking 
the  elements  of  the  unit,  enabling  playback  and  review  of  training  events  for 
after-action  review  (AAR).  For  instance,  it  is  only  on  exceptionally  rare  occasions 
that actual IEDs are recorded in images prior to exploding, yet those are
incredibly  valuable  for  training  and  analysis  purposes. 

2.  Constrained-View  Situational  Awareness 

The  view  of  the  external  world  from  within  a  tactical  vehicle  is  limited  due 
to  the  necessity  of  surrounding  combat  vehicles  with  armor  to  protect  the 
occupants. For instance, an M1114 up-armored High Mobility Multipurpose
Wheeled  Vehicle  (HMMWV)  is  surrounded  by  armor  plating  and  armored  glass. 
The  armor  helps  protect  the  occupants,  but  results  in  very  limited  visibility.  The 
crew  in  the  front  seats  has  best  visibility  through  the  forward  60-degree  horizontal 
arc,  with  visibility  more  limited  through  the  smaller  side  windows,  and  limited 
even further for the crew in the rear seats (see Figure 3). Because of the limited
field  of  view,  crew  members  in  general  and  the  vehicle  commander  in  particular 
often  rely  on  verbal  information  from  other  members  of  the  crew  to  piece  together 
a  full  picture  of  the  surroundings. 
Figure  3  Crew  fields  of  view  from  inside  a  HMMWV.  Each  color  represents  the  field 
of  view  from  a  crew  position.  The  mottled  appearance  is  an  artifact  of 
depth-buffer  fighting  in  areas  where  views  overlap. 
III.  CURRENT  SOLUTIONS 


The problems described in the previous chapter, knowledge persistence and
constrained-view situational awareness, have existed throughout modern warfare,
as shown by the various attempts made and the several operational systems
acquired to address them. In this chapter, we describe some previous
solutions aimed at each problem, along with their benefits and drawbacks.

A.  KNOWLEDGE  PERSISTENCE 

Throughout  the  history  of  warfare,  there  have  been  many  ways  of  attempting 
to  deal  with  the  problem  of  providing  a  so-called  Common  Operating  Picture  (COP), 
which  is  consistent  across  the  unit  and  common  to  all  subordinate  headquarters.  The 
foundation  of  the  COP  rests  principally  on  some  sort  of  understandable  representation 
of  the  terrain  in  the  area  of  operations.  On  top  of  the  terrain  model,  a  structure  is  built 
out  of  components  representing  maneuver  elements,  area  boundaries,  target 
locations  and  other  pertinent  data.  This  COP  is  then  regularly  disseminated  and 
updated  with  the  current  picture,  which  constantly  changes  over  time.  So  far,  there 
have  been  various,  increasingly  capable  methods  for  distributing,  viewing,  saving 
and/or  organizing  this  tactical  knowledge. 

1.  Paper Map Overlays

Perhaps  the  simplest  way  of  conveying  the  operational  picture  is  a  sketch 
depicting  the  AO  and  graphic  control  measures.  Until  recently,  this  basic  method  was 
the  only  way  to  track  the  tactical  scene.  The  practice  of  using  military  maps  typically 
involves  a  base  topographic  map  with  terrain  features,  with  transparent  overlays  laid 
on top, aligned via “witness marks.” These overlays have tactical graphic control
measures  drawn  on  them,  usually  in  an  indicative  color.  Boundary  overlays  are 
drawn  using  black;  obstacle  overlays  are  usually  green;  enemy  locations  are  red  and 
so  on.  These  overlays  can  then  be  placed  on  the  map  in  various  combinations  based 
on  the  user’s  needs. 
Another  variation  of  paper  maps  is  the  creation  of  printouts  of  digital 
products,  such  as  PowerPoint™  slides.  These  slide  printouts  have  recently  been 
the  major  way  of  getting  portable  information  to  low-level  units,  because  current 
command  and  control  systems  in  vehicles  do  not  provide  the  desired  information 
fusion. 

Advantages 

•  Persistent:  requires  no  power  source 

•  Portable:  can  be  folded  and  stuck  in  pockets 


Drawbacks 

•  Low  fidelity  and  detail:  restricted  to  one  scale 

•  Comprehensive  maps  are  physically  large  and  ungainly 

•  Immutability:  maps  cannot  be  updated  in  a  standardized  way 

•  Overlays  must  be  carefully  managed,  due  to  outdating 


2.  Sand  Table 

A  sand  table  is  a  venerable  standard  format  for  conducting  rehearsals, 
which  in  turn  provide  a  common  framework  from  which  to  operate.  A  portion  of 
ground  (preferably  sand)  is  sectioned  off,  and  a  miniature  terrain  model  is  built  of 
the  operational  plan.  (Sometimes  an  actual  table  with  walls,  filled  with  sand  is 
used,  but  this  is  mostly  in  school  environments.)  Roads,  rivers,  hills,  other  terrain 
features  and  inhabited  areas  can  all  be  portrayed  with  common  school  supplies, 
and  operational  information  can  be  written  on  cards  and  placed  around  the 
model.  Subordinate  units  are  depicted  as  well,  and  at  the  very  least  the  unit  key 
leaders  gather  around  the  model  (or  actually  stand  inside  it)  and  walk  through  the 
operation  in  miniature  (Figure  4).  This  rehearsal  method  is  a  good  way  to  ensure 
synchronization  among  subordinates.  Map  rehearsals  are  similar  to  sand  tables, 
differing  mainly  in  that  a  map  is  used  instead  of  a  dirt  model,  and  consequently 
the number of participants is limited.
Figure  4  Sand  table  (From  [1]) 


Advantages 

•  Relatively  simple 

•  Minimum  infrastructure  required 

•  General  familiarity  across  the  force 

Drawbacks 

•  Can  be  time-intensive  to  construct 

•  Generally  more  of  an  abstraction  than  realistic  model 

•  Requires  collocation  of  rehearsal  participants 

3.  Blue  Force  Tracking  Systems 

Blue  Force  tracking  systems  are  the  recently  fielded  digital  command  and 
control  systems  for  use  in  vehicles  and  other  battlefield  entities.  At  their  most 
basic, they allow position information of individual vehicles to be shared across
the force, creating a common picture of the locations of friendly forces. Their
elements usually include a vehicle- or soldier-mounted processing device and flat-panel
display, a wireless network (usually either a satellite broadcast
network or a peer-to-peer mesh network), and some less mobile network control
nodes. Other features can be added to take advantage of the capability provided
by the network.


a.  FBCB2/BFT 

Force  XXI  Battle  Command  for  Brigade  and  Below  (FBCB2)  [2]  and 
Blue  Force  Tracker  (BFT)  are  the  digital  communications  platforms  currently  in  use 
in  the  majority  of  U.S.  combat  vehicles.  These  two  systems  both  consist  of 
hardened/rugged  digital  computers  mounted  in  vehicles  (Figure  5)  and  connected 
to  GPS  receivers  and  wireless  communication.  They  differ  mainly  in  that  FBCB2 
achieves  connectivity  to  the  tactical  network  through  either  the  Enhanced  Position 
Location  Reporting  System  (EPLRS)  digital  radio  transceiver  (which  is  specifically 
dedicated  to  digital  connectivity)  or  the  Single  Channel  Ground  and  Airborne  Radio 
System (SINCGARS) standard radio (also used for voice communications), while
the  BFT  connects  to  the  network  through  a  satellite  transceiver. 
Figure 5  FBCB2 hardware mounted in a HMMWV (From [3])

These  two  systems  are  used  for  multiple  purposes,  which  are 
centered  on  the  concepts  of: 

•  Self-position  location  via  GPS 

•  Tracking  and  display  of  the  locations  of  other  units  with 
similar  systems,  through  a  tactical  network  through  which 
each  element  reports  and  updates  its  own  position  on  a 
periodic  basis 

•  A top-down view display to depict locations and properties
of  all  the  connected  blue  force  elements,  aligned  with 
topographical  map  data  and/or  aerial  imagery  (Figure  6) 

•  An  overlay  system  whereby  tactical  mission  graphic  control 
measures  can  be  overlaid  on  the  topographic  data  to  depict 
boundaries,  routes  and  other  information 

•  A tactical messaging system for sending various text reports
to either one or multiple elements, as well as disseminating
graphics overlays which can then be displayed

Figure 6  FBCB2 display (From [4])

FBCB2 functional capabilities can be seen in Figure 7 [5].


Area                          FBCB2 Capabilities

Digital Basics                Establish proper communication network
                              Clear queues and logs
                              Set filters and respond to alerts
                              Use filing/naming conventions
                              Perform maintenance and troubleshooting

Battlefield Visualization     Relate threat to own/unit location
                              Tailor situational awareness (SA) picture
                              Manage Red icons
                              Post obstacle overlays

Mission Planning &            Apply Line of Sight (LOS) tool for terrain analysis
Preparation                   Apply LOS tool for perimeter defense planning
                              Use FBCB2 to plan and control fire support
                              Use FBCB2 to support logistical planning/preparation
                              Construct and update overlays
                              Leverage FBCB2 in multi-echelon wargaming

Information Exchange          Prepare and manage messages and graphics
                              Disseminate messages and graphics
                              Confirm reception of critical messages

Mobility & Maneuver           Use FBCB2 to plan and execute movements
                              Leverage FBCB2 in maneuver decisions
                              Exploit FBCB2 in fratricide prevention

Figure 7  Table of FBCB2 functional capabilities (From [5])


Advantages 

•  The  first  widely  used  digital  blue  force  tracking  system,  in 
pervasive  use  among  all  U.S.  forces 

•  Allows  the  user  to  understand  much  more  of  the  tactical 
situation  than  was  previously  available 

•  Part  of  the  Army  Battle  Command  System  suite  of  systems, 
which  allows  lower-level  tactical  information  to  be  integrated 
into  the  higher  level  Common  Operating  Picture  (COP) 
Drawbacks 

•  Positioning  of  blue  forces  is  not  real-time:  it  is  periodic, 
because updates are sent using a “heartbeat” method to
allow  all  positions  to  update  on  the  network.  Additionally,  in 
practice,  the  GPS  does  not  provide  exceptional  accuracy. 

•  From  an  operator’s  perspective,  the  system  has  an  interface 
that  meets  all  specified  requirements,  but  is  awkward  for 
active  use  in  combat  situations 

•  Originally  intended  to  provide  information  dominance  on  a 
high-intensity  combat  battlefield:  suitable  for  maneuver 
warfare,  but  lacks  fidelity  or  versatility  for  urban  COIN 
operations. 
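
The periodic position reporting noted in the first drawback can be illustrated with a small sketch. It is not FBCB2 code: the class and field names, the 30-second interval used in the usage note, and the rule that a track is stale after two missed reports are all assumptions made for illustration. The point is only that a displayed position is as old as the last report received, so the possible position error grows with elapsed time and vehicle speed.

```python
from dataclasses import dataclass


@dataclass
class PositionReport:
    """One periodic self-position report (hypothetical message layout)."""
    unit_id: str
    lat: float
    lon: float
    timestamp: float  # seconds since an arbitrary epoch


class BlueForcePicture:
    """Keeps the most recent reported position for each unit."""

    def __init__(self, heartbeat_s: float):
        self.heartbeat_s = heartbeat_s  # assumed reporting interval
        self.latest = {}

    def receive(self, report: PositionReport) -> None:
        current = self.latest.get(report.unit_id)
        # Keep only the newest report; late or duplicate packets are ignored.
        if current is None or report.timestamp > current.timestamp:
            self.latest[report.unit_id] = report

    def age(self, unit_id: str, now: float) -> float:
        """Seconds elapsed since the unit last reported."""
        return now - self.latest[unit_id].timestamp

    def is_stale(self, unit_id: str, now: float, missed: int = 2) -> bool:
        """True once more than `missed` reporting intervals have passed."""
        return self.age(unit_id, now) > missed * self.heartbeat_s

    def max_position_error_m(self, unit_id: str, now: float,
                             speed_mps: float) -> float:
        """Upper bound on distance moved since the last report."""
        return self.age(unit_id, now) * speed_mps
```

With an assumed 30-second interval, a vehicle moving at 10 m/s that last reported 20 seconds ago may be up to 200 m from its displayed icon, before any GPS error is considered.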

b.  Tacticomp 

Tacticomp™ (see Figure 8) is a system produced by Sierra Nevada
Corporation [6] that combines many functions provided by FBCB2 with other
functions such as video streaming and file sharing. It has been test-fielded
to some units in theater, but has not been acquired on a large scale.

Advantages 

•  Provides  many  of  the  same  functions  as  FBCB2 

•  Allows  flexible  interface  for  users  to  share  more  ambiguous 
data,  such  as  on-the-fly  sketches  and  images 

•  Runs on the Windows operating system, which greatly
reduces the learning curve for soldiers already familiar with
such systems

•  Dismountable 
Figure  8  Tacticomp  6  tablet  (From  [6]) 

Drawbacks 

•  Not  fielded  in  large  numbers,  so  the  mesh  network  involved 
is  not  very  robust 

•  Also  limited  to  2D  depictions  of  the  battlespace 

4.  Web-Based  Tactical  Information  Assets 

With  the  proliferation  of  computing  and  networking  technology,  the  basic 
Web  browser  can  be  used  as  a  device  for  a  shared  operational  picture. 
Numerous  databases  of  tactical  information  can  be  connected  via  server-side 
software,  and  accessed  on  the  network  by  dispersed  users  using  Web  page 
interfaces.  These  information  sources  can  be  scaled  well,  and  can  be  updated 
as  necessary  on  the  server  side,  rather  than  requiring  hardware  or  client  software 
updates. These online repositories can provide a much greater depth and
breadth of information to the user than the currently fielded mobile
systems. However, they also consume bandwidth that might not be available over
current tactical networks.

a.  TiGRnet 

The  Tactical  Ground  Reporting  system  (TiGRnet)  [7]  is  a  program 
spawned  from  DARPA  that  found  great  success  in  current  operations.  It  is 
essentially a GIS Web service (see Figure 9), which allows small tactical units to
compile,  spatially  relate  and  share  numerous  types  of  relevant  information  in  a 
dispersed  manner.  The  system  involves  a  server  architecture  that  allows  units  to 
establish  their  own  local  system  that  is  simultaneously  connected  to  the  rest  of  the 
TiGR  network. 


[Figure: screenshot from DARPA’s TIGR (Tactical Ground Reporting) system (synthetic data). TIGR is a multimedia reporting system for soldiers at the patrol level, allowing users to collect and share information to improve situational awareness and to facilitate collaboration and information analysis among junior officers. DARPA image.]

Figure 9  TIGR large-scale view (From [8])

The  core  TIGR  service  involves  a  map  interface,  which  incorporates 
the  capability  to  access  many  layers  of  information.  Units  can  upload  pictures, 
video  and  documents,  and  associate  them  spatially  with  particular  locations  and/or 
individuals.  Units  can  do  a  walkthrough  of  routes  they  are  planning  to  take,  or 
locations  where  they  intend  to  operate,  and  access  any  pertinent  information  about 
locations  and  sites  that  they  may  pass  or  transit.  This  allows  much  greater 
contextual  understanding  of  the  upcoming  mission  environment,  and  the  data  can 
also  be  integrated  with  other  systems  for  intelligence  analysis. 
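
The route walkthrough described above amounts to a spatial query: given reports tied to locations, return those that lie near a planned route. The following is a minimal sketch of that idea and is not based on TIGR's actual implementation; the `Report` structure, the function names, and the sample coordinates in the usage note are hypothetical. Distance is measured with the standard haversine formula, and for brevity only to the route waypoints rather than to the segments between them.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


@dataclass
class Report:
    """A geo-referenced report (hypothetical structure)."""
    title: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def reports_near_route(reports, waypoints, radius_m):
    """Return reports within radius_m of any waypoint, nearest first."""
    hits = []
    for report in reports:
        nearest = min(haversine_m(report.lat, report.lon, wlat, wlon)
                      for wlat, wlon in waypoints)
        if nearest <= radius_m:
            hits.append((nearest, report))
    # Sort on distance only, so Report objects are never compared.
    hits.sort(key=lambda pair: pair[0])
    return [report for _, report in hits]
```

For example, with waypoints at (33.300, 44.400) and (33.310, 44.400) and a 200 m radius, a report placed at (33.3005, 44.400) is returned, while one at (33.400, 44.400), roughly 10 km away, is not. A fielded system would more likely test distance to route segments and use a spatial index instead of scanning every report.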

Advantages 

•  Allows  integration  and  sharing  of  numerous  forms  of 
pertinent  information 

•  Web service model allows for easier configuration
management

•  Allows  spatial  contextualization  of  information 

Drawbacks 

•  Not currently mobile: units do not have access during
operations, but only back at a fixed site with connectivity,
thus limiting use to pre- and post-operation periods.

•  2-D  map  based  on  aerial  imagery  does  not  permit  distinction 
of  height-off-the-ground  as  might  be  of  importance  to 
ground-based  forces.  This  limits  fidelity,  immersion  and 
presence 

b.  Buckeye 

Buckeye  is  the  name  of  a  product  from  the  Army  Geospatial  Center 
(AGC)  [9]  that  provides  high-resolution  overhead  imagery  of  numerous  locations 
throughout  the  theater  of  operations.  These  images  are  commonly  placed  into 
PowerPoint  slides,  and  have  operational  graphics  drawn  upon  them.  These 
images  provide  a  greatly  increased  sense  of  the  area  being  viewed,  compared  to 
standard  topographical  maps. 
Figure  10  Buckeye  View  (After  [10]) 

Advantages 

•  High-resolution  aerial  imagery 

•  Simple  interface 

Drawbacks 

•  Images  sometimes  taken  at  oblique  angles 

•  Limited  or  no  detail  of  vertical  surfaces 

c.  Project  Tourist 

Project  Tourist  [11]  is  another  AGC  service  that  incorporates 
spherical  video  of  urban  areas  synced  to  a  top-down  map  view  that  allows  the  user 
to  select  routes  to  view.  These  routes  can  then  be  viewed  as  a  virtual  tour,  with 
the  map  showing  the  top-down  location,  and  the  video  or  panoramic  still  frame 
showing  the  surroundings  at  that  point.  This  service  is  very  similar  to  Google 
StreetView™  [12],  but  provides  data  of  areas  in  the  active  theater  of  operations. 
Advantages 

•  High-resolution street-level panoramic imagery and video

•  Provides  multiple  angle  views  of  street-level  features 

Drawbacks 

•  Collecting  capability  not  yet  distributed 

•  Data  can  be  out  of  date 

•  Currently  no  depth  data  on  the  video,  limiting  the  geo-spatial 
correspondence  between  the  spherical  view  and  the  top- 
down  map 

•  Opportunities  for  confusion 

d.  SharePoint™ and Web Portals

A  common  method  for  documenting  and  storing  tactical  knowledge  is 
by  using  office  software  (usually  Microsoft  Office™)  to  generate  documents,  which 
are  then  saved  on  the  tactical  network.  These  products  can  span  all  the  way  from 
text-only  documents  to  complex  multimedia  presentations.  Once  they  are 
constructed,  these  documents  can  be  shared  for  collaboration  purposes  via  Web 
portals  on  the  tactical  internet. 

Advantages 

•  Allows  detailed  documentation 

•  Existing  familiarity  across  the  force 

Drawbacks 

•  Currently  must  be  printed  out  to  be  taken  on  operations 

•  File  size  becomes  quite  large  with  added  detail:  long 
transmission  lag 
5.  Serious Games


Figure 11  U.S. Army cadets participating in game-based training (From [13])


Some  units,  on  their  own,  have  made  inroads  into  the  use  of 
commercial  first-person  simulation  games  (such  as  ArmA  2  [14])  for  rehearsal 
purposes. The U.S. Army and USMC have recently adopted a similar system,
Virtual Battlespace 2 [15], as an official gaming platform. This can be an effective
means  of  rehearsing  an  operation. 


Advantages

•  Allows  visualization  of  the  actual  mission 

•  If  networked,  allows  a  much  more  realistic  rehearsal  than 
other  methods,  and  thus  better  cognitive  absorption  of  the 
operational  environment. 
Drawbacks 


•  Requires  digital  terrain,  which  may  be  time-consuming  to 
build 

•  Requires  hardware  and  software  for  each  participant,  which 
is  not  usually  available 

•  Not  currently  configured  to  generate  game  objects  from 
battle  command  system  data 

6.  Analysis 

The  identified  approaches  to  addressing  the  Knowledge  Persistence  problem 
can  be  analyzed  and  relative  strengths  and  weaknesses  compared,  in  order  to 
develop  a  more  satisfactory  solution.  For  each  current  solution,  we  have  assessed 
the  relative  strength  of  four  characteristics  appropriate  to  the  domain. 

a.  Terrain  View 

This  attribute  is  a  rating  of  the  ability  of  the  solution  to  provide  a 
detailed,  realistic  view  of  the  terrain  in  the  operational  environment  in  question,  in 
which  contextual  information  can  be  situated. 

b.  Available  On-the-Move 

This  attribute  is  a  rating  of  the  ability  for  the  solution  to  be  utilized 
while  in  the  operational  environment,  in  a  moving  vehicle. 

c.  Data  Updatability 

This  attribute  rates  the  ease  with  which  the  system  can  update, 
change  and  disseminate  new  information. 

d.  Placement  of  Spatial-Contextual  Information 

This  attribute  is  a  rating  of  the  degree  to  which  the  solution  provides 
the  ability  to  view  information  in  its  spatial/situational  context,  in  order  to  enhance 
the  user’s  understanding. 
For each of these described attributes, we have assessed each
identified solution on a scale from 1 to 5, with 1 being “very strong” and 5 being
“very weak,” as displayed in Table 1.

Table 1  Solution assessment-Knowledge persistence

Problem: Knowledge Persistence

                                                   Assessment
Solutions                                Terrain  Available    Data          Spatial-Contextual
                                         View     On-The-Move  Updatability  Info Placement

Paper Maps w/ Overlays                   4        2            5             4
Sand Table                               3        5            4             4
Blue Force Tracking Systems              3        1            3             4
Web-Based Tactical Information Assets    3        5            2             3
Serious Games                            2        5            4             3

Upon reviewing our subjective assessments, one can see that none
of the solutions is particularly effective across all attributes, and none has more
than one attribute scored better than “3,” or “fair.” Since it is our intent to solve the
Knowledge Persistence problem in a more satisfactory manner, it is important that
our developed solution show improvement across our identified attributes.
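
The observation drawn from Table 1 can be checked mechanically. The sketch below simply transcribes the table's scores (1 is strongest, 5 is weakest) and counts, for each solution, how many attributes score better than the midpoint of 3; the variable and function names are illustrative, and treating 3 as the "fair" threshold is our reading of the scale.

```python
# Attribute order matches the columns of Table 1.
ATTRIBUTES = ("terrain_view", "on_the_move", "updatability", "spatial_context")

# Scores transcribed from Table 1 (1 = very strong, 5 = very weak).
SCORES = {
    "Paper Maps w/ Overlays": (4, 2, 5, 4),
    "Sand Table": (3, 5, 4, 4),
    "Blue Force Tracking Systems": (3, 1, 3, 4),
    "Web-Based Tactical Information Assets": (3, 5, 2, 3),
    "Serious Games": (2, 5, 4, 3),
}

FAIR = 3  # assumed midpoint of the 1-to-5 scale


def better_than_fair(scores):
    """Count attributes scored stronger (numerically lower) than FAIR."""
    return sum(1 for s in scores if s < FAIR)


def best_per_attribute(table):
    """Strongest (lowest) score achieved by any solution, per attribute."""
    return {attr: min(row[i] for row in table.values())
            for i, attr in enumerate(ATTRIBUTES)}


for name, scores in SCORES.items():
    print(f"{name}: {better_than_fair(scores)} attribute(s) better than fair")
```

No solution has more than one attribute better than fair, and the Sand Table has none. Looking per attribute, the strongest score any existing solution achieves for spatial-contextual information placement is only 3.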

B.  CONSTRAINED-VIEW  SITUATIONAL  AWARENESS 

Aside  from  maintaining  a  COP  and  persisting  the  knowledge  it  contains, 
operating  forces  must  be  acutely  aware  of  their  immediate  surroundings  and  observe 
and  process  the  environment  and  situation.  This  ability  to  generate  situational 
awareness  becomes  problematic  through  the  restriction  of  view  of  vehicle 
crewmembers  due  to  necessary  armor  requirements. 
1.  Human Gunner-Observer


To alleviate the visibility problem, most combat vehicles employ a gunner in a turret position atop the vehicle to perform two tasks: engage targets with a direct-fire weapon, and provide visibility around the vehicle. The latter task is much more prevalent than the former. This need for visibility conflicts with survivability: gunners are by far the most vulnerable members of vehicle crews. Their position exposes them to small arms fire and IED explosions. Additionally, because of the unwieldy nature and high center of gravity of heavily armored wheeled combat vehicles, rollovers are relatively common, and gunners are very vulnerable in such situations.

HMMWV gunners (see Figure 12) historically have had a high casualty rate in combat. Because of this, there has been a focused effort to mitigate this vulnerability: first turret armor, and then armored glass, was added to turrets to protect against small arms and fragments. Also, gunners have been given harnesses and tie-downs in order to prevent them from being thrown from the vehicle during rollovers. The Army has gone so far as to develop an armored suit for gunners to wear, similar to the “bomb suits” worn by explosive ordnance technicians (see Figure 13 [16]).

Figure 12 A gunner in a HMMWV turret


Advantages 

•  Gunners  have  a  far  better  field  of  view  than  crew  in  the 
vehicle 

•  They  can  engage  targets  with  either  lethal  or  non-lethal 
weapons  as  appropriate 

•  Gunners  provide  the  advantage  of  being  able  to 
communicate  with  the  local  civilian  traffic  via  hand  and  arm 
gestures  in  order  to  convey  intent  and  commands:  this  helps 
prevent  misunderstandings  and  escalation  of  force  incidents 


Drawbacks 

•  Gunners  are  vulnerable  to  small  arms  fire 

•  Gunners  have  less  protection  from  explosions  than  the  rest 
of  the  crew 

•  Gunners  get  thrown  from  vehicles,  pinned  and/or  crushed 

•  Measures  necessary  to  improve  survivability  for  gunners
result  in  degradation  of  other  vehicle  characteristics:  vehicles 
develop  a  higher  center  of  gravity,  and  unwieldy  protective 
apparel  and  safety  measures  create  difficult  conditions  for 
gunners.  [18] 


Figure  13  Cupola  Protective  Ensemble  (CPE)  for  gunners  (From  [17]) 

2.  Remote  Weapon  Stations 

Recently, remote weapon stations (e.g., Figure 14) [18] have become more prevalent and widespread. These systems are essentially a remote-controlled weapon mounted atop a vehicle that can be aimed and fired by an operator inside the vehicle, viewing the target through electro-optics. They are in use on almost every one of the Stryker combat vehicle variants; are being mounted on HMMWVs and MRAPs; and are even being incorporated into the Tank Urban Survival Kit, an add-on kit for the M1A2 Abrams tank. They allow the gunner to stay inside the relative protection of the vehicle while being able to engage targets with high precision.

Figure  15  USAF  airman  demonstrating  the  CROWS  weapon  control  station  inside  a  HMMWV  (From  [20])


Advantages 

•  Better  protection  for  the  gunner 

•  Enhanced  imaging  capabilities,  including  thermal  optics

•  Much  more  precise  target  engagement  and  stabilization 
method 


Drawbacks 

•  Mechanical  malfunctions  more  common 

•  Gunner  has  limited  FOV  at  any  one  time  (“Soda  Straw”
effect)

•  Vehicle  Commander  does  not  have  override  or  view 
capability 

3.  See-Through  Turret


The U.S. Army has experimented in the recent past with the concept of the “See-Through Turret.” The system involves mounting cameras around the outside of a tank or other combat vehicle, with interior displays that let the crew members view the entire surroundings of the vehicle with few or no blind spots [21]. This has not progressed past the prototype stage, although the CROWS program management has expressed interest.

Advantages 

•  Crewmembers  can  view  the  entire  surroundings  of  the 
vehicle  simultaneously,  improving  SA 

Drawbacks 

•  Display  methods  have  been  troublesome:  HMDs  and  flat 
displays  have  been  tried,  but  with  difficulties 

•  Crewmembers  are  often  busy  with  other  tasks  (loading  the 
main  gun,  driving,  engaging  targets)  which  makes  additional 
information  difficult  to  handle 

4.  Analysis 

As  in  the  Knowledge  Persistence  (KP)  problem,  these  identified  approaches
to  addressing  the  Constrained  View  Situational  Awareness  problem  can  be  analyzed, 
and  relative  strengths  and  weaknesses  again  compared.  In  this  case,  for  each 
current  solution,  we  have  assessed  the  relative  strength  of  four  attributes: 

a.  Crew  Protection 

This  attribute  is  a  rating  of  the  additional  protection  added  to  the 
vehicle  crew  by  the  application  of  the  rated  solution. 




b.  Vehicle  Commander  Visibility


This  is  a  rating  of  the  overall  visibility  of  the  surrounding  environment 
provided  by  the  system  to  the  vehicle  commander  using  the  system,  which  is 
critical  to  the  SA  of  the  crew  in  general. 

c.  Weapon  System  Integration 

This  attribute  rates  the  degree  to  which  the  vehicle’s  weapon 
systems  are  integrated  with  the  situational  awareness  solution  (that  is,  the  ease 
with  which  the  crew  can  identify  and  engage  a  valid  target). 

d.  Placement  of  Spatial-Contextual  Information 

As  in  the  KP  problem,  this  attribute  is  a  rating  of  the  degree  to  which
the  solution  provides  the  ability  to  view  information  in  its  spatial/situational  context, 
in  order  to  enhance  the  user’s  understanding. 

Again, for each of these described attributes, we have assessed each identified solution on a scale from one to five, with 1 being “very strong” and 5 being “very weak,” as displayed in Table 2.

Table 2  Solution assessment-Constrained view situational awareness

Problem: Knowledge Persistence Assessment

Solutions                               Terrain   Available     Update      Spatial-Contextual   GIG
                                        View      On-The-Move   Frequency   Info Placement       Integration
Paper Maps w/ Overlays                  4         2             5           4                    5
Sand Table                              3         5             4           4                    5
Blue Force Tracking Systems             3         1             2           4                    3
Web-Based Tactical Information Assets   3         5             2           3                    1

Similarly  to  the  previous  problem,  upon  reviewing  our  subjective 
assessments,  one  can  see  that  none  of  the  solutions  is  particularly  effective  across 
all  attributes,  and  high  scores  in  some  attributes  tend  to  be  balanced  by  poor 
scores  in  other  attributes.  Since  it  is  our  intent  to  also  solve  the  Constrained  View 
Situational  Awareness  problem  in  a  more  satisfactory  manner,  it  is  important  that 
our  developed  solution  shows  improvement  across  our  identified  attributes. 

Our prototype system will address several desirable improvements to the current status quo, including:

•  360-degree visibility for TC (and perhaps others)

•  Precise weapon fire control by vehicle commander

•  Nonobstructive/unobtrusive

IV.  SOLUTION  DEVELOPMENT


In  our  search  for  better  solutions  to  the  two  identified  operational  problems  of 
Knowledge  Persistence  and  Constrained-View  Situational  Awareness,  we  propose 
that  a  technical  development  known  as  Augmented  Reality  (AR)  has  the  potential  to 
address  both  problems  simultaneously  in  one  combined  solution.  Prior  to  discussing 
our  findings  on  this  topic,  we  must  first  provide  a  background  overview  of  this 
technology  to  be  investigated. 

A.  AUGMENTED  REALITY 

Augmented Reality [22] is the imposition of spatially-registered computer graphics over a live image of the real world, be it a video feed (known as video see-through AR) or a direct view (known as optical see-through AR). The essential characteristic of AR is spatial registration: simply imposing text or other iconography over the live image does not make a system qualify as augmented reality. By spatial registration, we mean that the augmentations move with the view: that is, the generated graphics behave visually as if they were located at an actual point in space. Augmented reality is a part of a so-called “reality/virtuality continuum” [23], as seen in Figure 16, with “actual” reality on the left, and fully simulated or virtual reality on the right.

Figure 16 Milgram’s Virtuality Continuum


B.  AR  REQUIRED  CAPABILITIES 

1.  Determine  Pose  of  Point  of  View

In  order  to  effectively  augment  reality  (as  well  as  to  provide  several  other 
desired  capabilities),  we  must  determine  both  the  location  and  orientation  (or  pose)  of 
the  viewer,  in  order  to  register  the  generated  augmentations  with  the  physical  world. 
Registration  is  critical  in  AR:  registration  error  causes  a  cascade  of  problems, 
including  erroneous  icon  placement,  movement  of  annotations,  and  general 
inaccuracy  of  data.  For  this  reason,  we  must  seek  to  register  the  user  point  of  view  as 
accurately  and  precisely  as  possible.  This  requirement  can  be  addressed  in  various 
ways. 
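
To give a sense of scale for registration error, the following Python sketch estimates how far an annotation is displaced by a small heading error. It assumes an ideal pinhole camera and a pure heading (yaw) error; the function name and the example values (a 1-degree error, a 100 m range, a 1000-pixel focal length) are illustrative assumptions, not measurements from our system.

```python
import math


def registration_error(heading_error_deg, range_m, focal_length_px):
    """Return (lateral error in meters, on-screen shift in pixels).

    Assumes an ideal pinhole camera and an annotation that would otherwise
    appear at the image center at the given range.
    """
    tangent = math.tan(math.radians(heading_error_deg))
    lateral_error_m = range_m * tangent        # displacement in the world
    pixel_shift = focal_length_px * tangent    # displacement on the display
    return lateral_error_m, pixel_shift


lateral, shift = registration_error(1.0, 100.0, 1000.0)
print(f"lateral error: {lateral:.2f} m, on-screen shift: {shift:.1f} px")
```

Under these assumptions, a heading error of a single degree displaces an annotation by roughly 1.7 m at a range of 100 m, which is enough to place an icon on the wrong doorway.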


a.  Degrees  of  Freedom 

When discussing registration, the key concept involved is degrees of freedom (DOF). A degree of freedom (in mechanics) is a displacement or rotation of a body or physical system: the DOFs of a body are the set of these that completely specify the displaced position and orientation of the body. This can be generalized as: a rigid body in $d$ dimensions has $d(d+1)/2$ degrees of freedom ($d$ translations and $d(d-1)/2$ rotations). A rigid body in the 3-dimensional space we inhabit therefore has 6 DOF: 3 degrees of translation, and 3 of rotation.
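
The general count for a rigid body, $d$ translations plus $d(d-1)/2$ rotations for a total of $d(d+1)/2$, can be tabulated with a few lines of Python. The function name `rigid_body_dof` is an illustrative choice for this sketch.

```python
from typing import Tuple


def rigid_body_dof(d: int) -> Tuple[int, int, int]:
    """Return (translations, rotations, total) for a rigid body in d dimensions."""
    if d < 1:
        raise ValueError("dimension must be a positive integer")
    translations = d
    rotations = d * (d - 1) // 2   # one rotation per pair of axes
    return translations, rotations, translations + rotations


for d in (1, 2, 3):
    translations, rotations, total = rigid_body_dof(d)
    print(f"d={d}: {translations} translations + {rotations} rotations = {total} DOF")
```

For $d = 3$ this gives the familiar 3 translations and 3 rotations, or 6 DOF.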

(1)  Position.  The  easiest  way  to  describe  translations  is  in 
Cartesian  coordinates:  [X,Y,Z],  where  X,  Y  and  Z  are  axes  with  one  degree  of 
freedom  each,  and  are  orthogonal  to  each  other.  However,  this  only  applies  at 
local  scales.  If  we  are  describing  coordinates  in  a  global  sense,  we  will  come  upon 
a  problem:  that  the  Earth  is  round.  If  we  start  at  a  point  on  the  equator,  and  move 
90  degrees  of  longitude  to  the  west,  around  the  globe,  “down,”  which  previously
was  a  distance  in  the  -Z  direction,  is  now  actually  a  distance  in  the  previous  X 
direction.  This  fact  comes  into  play  when  we  are  describing  things  on  a 
geographic  scale,  and  because  of  it,  the  coordinate  system  commonly  used  in 
georegistration  uses  the  measurements  known  as  longitude,  latitude,  and  altitude. 
These  are  spherical  coordinates,  with  longitude  being  measured  in  degrees  of 
rotation  around  the  earth’s  axis,  latitude  being  measured  in  degrees  of  rotation 
from  the  equator  toward  one  of  the  Earth’s  poles;  and  altitude,  which  is  commonly 
measured  in  feet  or  meters  above  (or  below)  sea  level.  Altitude  is  not  as  simple  as 
lat-long,  because  the  distance  from  the  center  of  the  earth  to  a  standard  sea  level 
varies  dependent  on  where  you  are  located:  the  Earth  is  not  a  perfect  sphere,  but 
instead  resembles  an  ellipsoid.  A  base  reference  model  of  this  imperfection  is 
known  as  a  datum,  or  standardized  model,  which  is  then  normalized  as  sea  level, 
or  zero  altitude.  The  conventional  global  standard  for  navigation  is  known  as  the 
World  Geodetic  System  1984,  or  WGS  84.  This  is  the  datum  used  in  the  Global 
Positioning  System  (GPS). 
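
The relationship between geographic and Cartesian coordinates can be sketched as follows. This minimal Python example applies the standard closed-form conversion from geodetic latitude, longitude and height to Earth-centered Cartesian coordinates, using the WGS 84 ellipsoid constants. Two points are assumptions of the sketch: the function name is ours, and the height is measured from the WGS 84 ellipsoid rather than from mean sea level.

```python
import math

WGS84_A = 6378137.0                      # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563            # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)     # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic coordinates to Earth-centered, Earth-fixed X, Y, Z (meters)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z


for lon in (0.0, -90.0):
    x, y, z = geodetic_to_ecef(0.0, lon, 0.0)
    print(f"lat 0, lon {lon:6.1f}: X={x:14.3f}  Y={y:14.3f}  Z={z:14.3f}")
```

In this frame, a point on the equator at 0 degrees longitude lies on the X axis, so “down” points along the negative X direction; after moving 90 degrees of longitude to the west, the point lies on the Y axis and “down” points along Y instead, which is the effect described above.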

(2) Tracking. For tracking in AR, we must keep the Cartesian/Geographic coordinate distinction in mind, and must be able to convert between the two. Orientation is commonly expressed as degrees of rotation (three, in our case). These degrees are usually expressed as Euler angles, which are rotations about each of the three translational axes. A specific type of Euler angles, the Tait-Bryan angles, is known to aviators as “Yaw, Pitch and Roll”: these indicate a body’s rotation around its own Z, Y and X axes, respectively.
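
As an illustration, yaw, pitch and roll can be combined into a single rotation matrix. The sketch below uses the common Z-Y-X order (yaw first, then pitch, then roll) and assumes the aerospace axis convention of X forward, Y right and Z down; both the order and the axis convention are assumptions of this example, and the function names are ours.

```python
import math


def rotation_from_yaw_pitch_roll(yaw_deg, pitch_deg, roll_deg):
    """Return the 3x3 matrix R = Rz(yaw) * Ry(pitch) * Rx(roll) as nested lists."""
    yaw, pitch, roll = (math.radians(a) for a in (yaw_deg, pitch_deg, roll_deg))
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


# Where does the body's forward (X) axis point after a 90-degree yaw?
forward = rotate(rotation_from_yaw_pitch_roll(90.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(tuple(round(c, 6) for c in forward))
```

With these conventions, a 90-degree yaw swings the forward axis onto the reference Y axis, and a positive pitch raises the nose toward the negative Z (upward) direction.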


Orientations  are  also  subject  to  frames  of  reference,  whether 
we  are  referring  to  global  or  local  rotations.  In  the  case  of  AR,  orientation  is 
usually  taken  to  mean  rotation  about  the  axes  of  latitude,  longitude,  and  altitude, 
with  respect  to  the  geographic  datum. 

(3)  Pose.  Pose  is  a  term  indicating  the  combination  of 
translation  and  orientation,  to  form  a  representation  of  all  six  degrees  of  freedom  of 
an  object. 


b.  Types  of  Tracking  and  Registration 

Tracking  and  registration  are  two  sides  of  the  same  coin.  Tracking  is 
the  process  of  identifying  the  pose  of  external  objects,  based  on  the  knowledge  of 
one’s  own  position,  while  registration  can  be  looked  at  as  determining  one’s  own 
position,  based  on  external  stimuli.  There  are  various  ways  of  accomplishing  both, 
as  follows. 


(1) Fiducial Marker Tracking. In this method, a camera is used that captures a video stream of the real world. Fiducial markers (such as the one seen in Figure 17) are then placed in the environment, and their pose is determined using computer vision techniques.

Figure 17 Example of a fiducial marker


When  a  marker  is  seen  by  the  camera,  computer  vision 
techniques  are  used  to  recognize  the  marker,  which  then  allows  augmentations  to 
be  placed  relative  to  the  marker’s  position.  There  are  numerous  software  libraries 
available  to  implement  this  method:  ARToolkit  [24],  ARTag  [25]  and  Studierstube
[26]  are  perhaps  the  most  popular. 

Degrees  of  Freedom:  This  method  allows  full  6DOF 
calculations,  as  long  as  the  markings  are  visible  to  the  camera:  accuracy  increases 
with  an  increase  in  size  of  the  marker  or  decreased  distance  to  the  marker, 
because  either  of  these  conditions  effectively  increases  the  resolution  of  the 
marker  to  the  camera. 
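
The dependence on marker size and distance follows from simple pinhole-camera geometry, sketched below. The function name and the example values (a 0.2 m marker and an 800-pixel focal length) are illustrative assumptions.

```python
def marker_width_in_pixels(marker_width_m, distance_m, focal_length_px):
    """Approximate on-image width of a marker facing an ideal pinhole camera."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * marker_width_m / distance_m


for distance in (1.0, 2.0, 4.0, 8.0):
    width = marker_width_in_pixels(0.2, distance, 800.0)
    print(f"{distance:4.1f} m -> {width:6.1f} px across")
```

Each doubling of the distance halves the number of pixels across the marker, and doubling the marker size restores them, which is why either change affects the accuracy of the recovered pose.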

Advantages:  One  feature  that  could  find  military  application 
is  to  place  markers  on  vehicles  as  both  an  Identify  Friend  or  Foe  (IFF)  aid  and  a 
“barcode  hyperlink,”  which  would  allow  the  vehicle  to  be  identified  by  the  system
and  be  tracked  automatically,  as  long  as  it  remained  in  sight. 

Limitations:  While  this  method  is  suitable  for  many  AR 
applications,  military  uses  are  limited  because  the  markers  must  be  preplaced:  this 
requirement  makes  annotation  of  a  large  urban  environment  difficult. 

(2)  Markerless  Vision  Tracking.  The  Markerless  Vision
Tracking  method  also  uses  a  camera,  but  no  markers  are  placed.  Computer  vision 
methods  are  used  to  locate  natural  features,  and  calculate  the  camera’s  position 
based  on  optical  flow  and  other  characteristics.  This  method  has  the  benefit  of  not 
requiring  markers,  but  is  computationally  intensive.  There  are  several  ways  to 
implement  markerless  tracking:  some  include  using  models  of  the  surroundings, 
which  simplifies  the  task.  Others  use  techniques  to  generate  a  model  from  the 
video  itself.  ARToolkit  Natural  Feature  Tracking  [27]  is  one  library  that  attempts  to 
implement  pose  estimation  without  the  use  of  markers  prepositioned  in  the 
environment. 


Degrees  of  Freedom:  The  Markerless  Vision  Tracking 
method  can  also  determine  all  6DOF. 

Advantages:  Visual  feature  tracking  has  the  advantage  of  not 
requiring  pre-annotation  or  markup  prior  to  use:  these  systems  can  be  easily  used 
in  new  environments. 

Limitations:  Natural  feature  tracking  is  computationally 
intensive.  Also,  it  is  susceptible  to  changes  in  the  lighting  environment,  such  as 
changes  in  contrast  or  brightness. 

(3)  Model-Based  Tracking.  Model-based  tracking  (MBT)  [28], 
[29]  is  related  to  natural  feature  tracking,  in  that  features  in  video  are  also  tracked. 
However,  in  this  case,  we  create  a  3D  model  of  the  environment  beforehand.  We 
can  render  the  model,  and  compare  it  to  the  video.  Given  a  particular  image  from 
the  video,  a  most-probable  self-location  can  be  calculated  by  determining  the  spot 
where  the  model  and  video  are  most  similar. 
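
The matching step can be illustrated with a deliberately simplified, one-dimensional toy. Here the “model” is a list of intensities along a street, the “video” is a short noisy window of that street, and similarity is measured by the sum of squared differences. These simplifications, the similarity measure, and all names are assumptions of this sketch, not the method of [28] or [29].

```python
def best_match_offset(model, observed):
    """Return (offset, cost) of the model window most similar to the observation."""
    best_offset, best_cost = None, float("inf")
    for offset in range(len(model) - len(observed) + 1):
        # Sum of squared differences between the model window and the observation.
        cost = sum(
            (model[offset + i] - value) ** 2 for i, value in enumerate(observed)
        )
        if cost < best_cost:
            best_offset, best_cost = offset, cost
    return best_offset, best_cost


street_model = [0, 0, 5, 9, 5, 0, 2, 8, 8, 2, 0, 0]   # prebuilt model (hypothetical)
camera_view = [2.1, 7.9, 8.2, 1.8]                     # noisy observation

offset, cost = best_match_offset(street_model, camera_view)
print(f"most probable position: offset {offset} (cost {cost:.2f})")
```

A real system would search over all six degrees of freedom and compare rendered images rather than short lists, but the principle of choosing the pose with the best model-to-video agreement is the same.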

MBT  obviously  requires  that  we  construct  the  model 
beforehand.  The  model  could  be  constructed  manually,  using  3D  modeling 
software,  or  automatically,  if  we  could  automatically  capture  the  texture  and
structure  of  an  urban  environment  and  transform  it  into  a  model.

Degrees  of  Freedom:  We  can  track  in  6DOF  using  the 
model-based  tracking  method. 

Advantages:  The  model-based  tracking  method  combines 
some  of  the  advantages  of  both  marker  and  markerless  tracking:  like  marker- 
based  tracking,  it  has  the  advantage  of  prior  knowledge  of  dimensions  of  features 
being  tracked.  Also,  like  natural  feature  tracking,  it  does  not  require  any  actual 
external  infrastructure  (this  having  been  replaced  by  the  model).

Limitations: MBT requires an accurate model for good performance: an inaccurate model has adverse effects on positioning, because the rendered model matches the video less closely and the most-probable location becomes less certain. Urban terrain changes with time, so the model must be updated frequently. Construction of a model is also nontrivial.

(4)  Inertial  Tracking.  In  this  method,  various  sensors  (to 
include  accelerometers  and  gyroscopes)  are  used  to  detect  changes  in  orientation 
and  translation,  by  integrating  the  acceleration  over  time.  These  techniques  have 
the  benefit  of  not  depending  on  any  external  signal,  other  than  gravity  and  inertia. 
They  have  the  disadvantage,  however,  of  drifting  over  time;  this  drifting  requires 
additional  tracking  means  to  recalibrate  the  inertial  sensor. 

Compasses  are  similar  to  inertial  trackers,  in  that  they 
measure  the  direction  of  an  acceleration  (in  this  case  a  force  caused  by  the  Earth’s 
magnetic  field),  which  presumably  aligns  north-south. 

Degrees  of  Freedom:  Inertial  sensors  are  limited  by  the 
physical  properties  of  the  type  of  sensor  used.  Most  are  meant  to  sense 
orientation:  gyroscopes,  for  example.  Others  sense  acceleration,  both  rotational
and  translational:  if  we  integrate  twice  over  the  accelerations,  we  can  get  changes 
in  pose,  for  a  full  6DOF. 

Advantages:  Inertial  sensors  require  no  external  signal,  so 
they  can  be  used  in  almost  any  environment. 

Limitations:  These  sensors  are  limited  by  several  factors, 
primarily  vibration  and  drift.  Vibration  introduces  noise  into  the  system,  which  can 
skew  measurements.  And  most  inertial  sensors  drift  over  time,  so  that  their 
internal  reference  coordinates  differ  from  those  of  the  real  world.  Because  of  this, 
inertial  sensors  tend  to  be  combined  with  other  methods,  in  order  to  “recalibrate”
these  reference  coordinates  periodically. 
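
The effect of drift can be demonstrated with a short simulation of double integration. The sketch below integrates the output of a stationary accelerometer that reports a small constant bias; the bias value, sample rate and function name are illustrative assumptions.

```python
def dead_reckon(accelerations, dt):
    """Integrate acceleration samples twice to obtain a change in position."""
    velocity = 0.0
    position = 0.0
    for acceleration in accelerations:
        velocity += acceleration * dt   # first integration: velocity
        position += velocity * dt       # second integration: position
    return position


BIAS = 0.01           # m/s^2, constant sensor bias (hypothetical)
DT = 0.01             # s, sample period (100 Hz)
samples = [BIAS] * 6000   # one minute of readings from a stationary sensor

drift = dead_reckon(samples, DT)
print(f"apparent displacement after 60 s: {drift:.1f} m")
```

Although the sensor never moves, a bias of only 0.01 m/s^2 accumulates into roughly 18 m of apparent displacement after one minute, because the position error grows with the square of elapsed time.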

In  the  case  of  a  compass,  magnetic  fields  can  be  generated 
by  things  other  than  the  Earth,  and  magnetic  objects  can  skew  the  instrument. 

(5) External Signal Tracking. External Signal Tracking (EST) involves reception of external signals that provide pose information. One example is the Global Positioning System (GPS): the constellation of GPS satellites orbiting the Earth sends out very precise timing signals. Because we know the location of the satellites, we can compare our local time with the time encoded in the transmission from each satellite. From these timing differences, we can calculate the intersection of the spheres centered at each satellite, each with a radius equal to the speed of light multiplied by the timing difference to that satellite. That intersection point is our current location.
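
The intersection calculation can be sketched in two dimensions, where the spheres become circles. The example below assumes a perfectly synchronized receiver clock and three transmitters at known positions; a real GPS receiver works in three dimensions and must also solve for its own clock error, which is why it needs at least four satellites. All names and positions are illustrative.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0   # m/s


def trilaterate_2d(anchors, ranges):
    """Intersect three circles (center, radius) and return the common point (x, y)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    # Subtracting circle 1 from circles 2 and 3 leaves two linear equations.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = r1 ** 2 - r2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = r1 ** 2 - r3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("transmitters are collinear; position is ambiguous")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


transmitters = [(0.0, 0.0), (10_000.0, 0.0), (0.0, 10_000.0)]   # meters
true_position = (3_000.0, 4_000.0)

# Simulate the measured signal delays, then convert them back to ranges.
delays = [math.dist(true_position, t) / SPEED_OF_LIGHT for t in transmitters]
ranges = [SPEED_OF_LIGHT * delay for delay in delays]

print(trilaterate_2d(transmitters, ranges))
```

Because one nanosecond of timing error corresponds to about 0.3 m of range, the precision of the timing signals directly bounds the precision of the computed location.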

The  case  of  GPS  is  different  from  other  tracking  methods:  its 
primary  purpose  is  to  measure  translation,  and  pure  GPS  does  not  measure 
orientation.  However,  this  limitation  can  be  remedied  by  making  some 
assumptions:  mainly,  that  a  vehicle  tends  to  point  in  the  direction  of  its  own 
movement.  For  land  vehicles,  this  usually  is  a  reasonable  assumption.  If  we  make 
this  assumption,  then  we  can  track  GPS  points  over  time,  and  use  the  orientation
of  the  vector  between  the  points  as  the  orientation  of  the  vehicle.  However,  this 
technique  is  restricted  to  estimating  pitch  and  yaw.  Any  degree  of  roll  could  be 
valid,  because  we  are  assuming  we  are  moving  along  the  X  axis. 
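
A minimal sketch of this estimate follows. It assumes that two successive GPS fixes have already been converted into a local east/north/up frame in meters, and it reports yaw as a compass heading measured clockwise from north; the frame, the function name and the sample points are assumptions of the example.

```python
import math


def heading_and_pitch(previous_fix, current_fix):
    """Estimate (yaw, pitch) in degrees from two (east, north, up) positions."""
    d_east = current_fix[0] - previous_fix[0]
    d_north = current_fix[1] - previous_fix[1]
    d_up = current_fix[2] - previous_fix[2]
    horizontal = math.hypot(d_east, d_north)
    if horizontal == 0.0 and d_up == 0.0:
        raise ValueError("vehicle has not moved; orientation is undefined")
    yaw_deg = math.degrees(math.atan2(d_east, d_north)) % 360.0
    pitch_deg = math.degrees(math.atan2(d_up, horizontal))
    return yaw_deg, pitch_deg   # roll cannot be recovered from the track


print(heading_and_pitch((0.0, 0.0, 0.0), (10.0, 0.0, 0.0)))   # driving due east
print(heading_and_pitch((0.0, 0.0, 0.0), (0.0, 10.0, 1.0)))   # north, climbing
```

The estimate is undefined while the vehicle is stationary and becomes noisy at low speed, when the distance between fixes is comparable to the GPS position error.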

Advantages:  GPS  tracking  is  available  for  most  places  on 
Earth,  and  it  does  not  require  any  prior  knowledge  of  the  environment.  It  can  also 
be  highly  precise,  if  additional  technologies  such  as  Differential  GPS  are  used. 

Limitations: Since GPS relies on electromagnetic (radio) transmissions, it can be susceptible to interference, and, as noted above, it is limited mainly to measuring translation.

(6)  Hybrid  Methods.  As  mentioned,  all  of  the  common 
tracking  methods  suffer  from  one  or  more  limitations.  However,  they  can  be 
combined  in  various  ways  to  greatly  improve  performance.  For  instance,  several 
applications  have  been  developed  for  the  Apple  iPhone®  3GS  that  implement  AR- 
type  capabilities.  These  applications  combine  the  native  sensors  on  the  phone 
(GPS,  gravitational  accelerometer,  and  compass)  with  video  tracking  using  the 
phone’s  camera  to  provide  registration.  Google’s  Android™  phone  operating 
system  also  has  multiple  applications  in  a  similar  vein.  These  are  simple 
applications,  on  small  mobile  devices,  but  demonstrate  great  potential  for  the 
fusion  of  sensor  data. 

In  a  larger  format,  there  are  several  INS  products  available 
that  remedy  the  noted  limitations  of  inertial  sensors  by  updating  the  system  with 
GPS  data,  in  order  to  avoid  drift  issues. 
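
One simple way to picture this kind of fusion is a complementary filter, which follows the gyroscope over short intervals and is pulled slowly toward an absolute reference such as a GPS-derived heading. The sketch below is illustrative only: fielded INS/GPS products commonly use more elaborate estimators, and the blending factor, bias and function names here are assumptions.

```python
def wrap_degrees(angle):
    """Wrap an angle into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def fuse_heading(previous_heading, gyro_rate_dps, reference_heading, dt, alpha=0.98):
    """Blend an integrated gyro rate with an absolute heading reference."""
    predicted = previous_heading + gyro_rate_dps * dt          # inertial prediction
    correction = wrap_degrees(reference_heading - predicted)   # disagreement with reference
    return wrap_degrees(predicted + (1.0 - alpha) * correction)


GYRO_BIAS_DPS = 0.5   # deg/s reported while the vehicle is not turning (hypothetical)
DT = 0.1              # s between updates

gyro_only = 0.0
fused = 0.0
for _ in range(600):   # one minute of updates; the true heading stays at 0 degrees
    gyro_only += GYRO_BIAS_DPS * DT
    fused = fuse_heading(fused, GYRO_BIAS_DPS, 0.0, DT)

print(f"gyro only: {gyro_only:.1f} deg   fused: {fused:.2f} deg")
```

In this simulation the unaided gyroscope drifts by 30 degrees in one minute, while the fused estimate settles at a bounded error of about 2.5 degrees.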

2.  Display  View 

A  second  component  of  Augmented  Reality  is  the  view  of  the  world,  which  has 
annotations  inserted  into  it.  Since  AR  must  combine  the  real  world  with  generated 
images,  there  is  a  complex  set  of  characteristics  in  the  interplay  between  the  technical
system  and  the  human  user  that  determines  whether  or  not  displays  are  suitable  for 
tactical  use. 


a.  Characteristics  of  the  Human  Eye 

In  order  to  evaluate  the  characteristics  of  displays,  let  us  consider 
some  anatomical  and  functional  characteristics  of  the  human  eye.  The  human  eye 
has  resolution  characteristics  as  shown  in  Table  3  [30]: 


Table 3  Visual resolution characteristics of the human eye

Characteristic                         High          Low
degrees/pixel                          0.02          0.03
pixels/degree                          50            33
num pixels/360°                        18,000        12,000
radius (radial pixels)                 2,864         1,909
Area (square pixels)/Visual sphere     105,246,320   45,795,386
Area (megapixels)/Visual sphere        105           46

As Figure 18 illustrates, these metrics indicate that a “pixel size” for the eye is approximately 1.2-1.8 arc-minutes, or 0.02-0.03 degrees.

Figure 18 Human eye resolution


From these rough measurements, we can see that the order of magnitude of the area of the visual field of a human, in terms of square resolution units, is near 10^8 square “pixels.” This is not actually a very accurate figure, since it approximates taking one’s eyes and scanning the fovea of the eye over every patch of an imaginary sphere centered at one’s head, but it gives a rough order of magnitude. To get an idea of this resolution, one might surround oneself with 10 WXGA+ LCD monitors edge-to-edge in a circle (10 times 1440 horizontal resolution).
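
The arithmetic behind Table 3 can be reproduced in a few lines. The sketch below starts from the angular size of one “pixel,” wraps a full circle of such pixels around the viewer, and computes the area of the resulting sphere as 4*pi*r^2; the function name is ours. Without intermediate rounding, the low-resolution case comes out near 46 million square pixels, as in Table 3, and the high-resolution case comes out near 103 million, slightly below the tabulated 105 million but of the same 10^8 order of magnitude.

```python
import math


def visual_sphere(degrees_per_pixel):
    """Return (pixels/degree, pixels per 360 degrees, radius, area) in pixel units."""
    pixels_per_degree = 1.0 / degrees_per_pixel
    circumference_px = 360.0 * pixels_per_degree
    radius_px = circumference_px / (2.0 * math.pi)
    area_px = 4.0 * math.pi * radius_px ** 2
    return pixels_per_degree, circumference_px, radius_px, area_px


for label, degrees_per_pixel in (("high", 0.02), ("low", 0.03)):
    ppd, circumference, radius, area = visual_sphere(degrees_per_pixel)
    print(f"{label:4s}: {ppd:5.1f} px/deg, {circumference:8.0f} px around, "
          f"radius {radius:7.1f} px, area {area / 1e6:6.1f} megapixels")
```

The same angular sizes correspond to 1.2 and 1.8 arc-minutes, since one degree contains 60 arc-minutes.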


That  rough  order  of  magnitude  is  for  an  entire  sphere:  humans  do 
not  see  in  a  panoramic  fashion.  Figure  19  [20]  shows  the  typical  overlapping 
binocular  field  of  view  for  an  average  person.  The  center  of  the  diagram  indicates 
the  center  of  the  composited  field  of  view  for  both  eyes:  the  concentric  circles 
indicate  the  angular  displacement  from  that  center,  from  0-90  degrees  off-center, 
in  all  directions.  The  radial  lines  indicate  the  direction  of  the  angular  displacement. 
The  white  region  indicates  overlapping  field  of  view  with  both  eyes,  while  the 
hatched  region  indicates  the  regions  that  can  be  seen  with  one  eye  only.  The
black  indicates  regions  outside  the  FOV.  (These  regions  are  asymmetric  due  to  a 
margin  of  error  in  the  data.) 


Figure  19  Human  field  of  view  (From  [31]) 

Figure  20  shows  a  dome  projection  of  a  complete  panorama  view
that  extends  from  0-180  degrees  off-center,  from  our  camera  system. 


Figure  20  360°  view  dome  projection 

Figure 21 shows the human field of view overlaid onto the dome projection, to illustrate how little of a complete surrounding view the human visual system can take in at any one moment. For improved situational awareness, a method of expanding the human field of view could be beneficial.

Figure  21  Human  visual  field  sectioned  out  of  the  panoramic  dome  (After  [31])


The  human  eye  also  has  great  dynamic  range:  natively,  the  retina  is 
capable  of  a  200:1  contrast  ratio.  However,  when  it  adjusts  the  light  input  by 
changing  the  size  of  the  pupil  with  the  iris,  the  total  dynamic  contrast  ratio  of  the 
eye  is  approximately  1,000,000:1.  When  selecting  a  display  method  to  convey 
visual  information  to  a  user,  it  is  important  to  keep  these  numbers  in  mind. 


In  discussing  display  options,  we  can  focus  on  two  main  areas:  the 
display  technology,  and  modes  of  display. 

b.  Technologies

Display  technologies  for  augmented  reality  can  be  grouped  into  two 
main  categories:  optical  see-through  (OST)  and  video  see-through  (VST).  These 
have  different  attributes  and  are  appropriate  for  different  uses.  Their  main 
difference  is  that  OST  combines  the  optical  view  of  a  scene  with  computer-
generated  imagery,  while  VST  uses  a  video  stream  as  its  background  scene,  and
draws  the  computer-generated  augmentations  by  changing  the  pixels  of  the  video 
frame. 


(1) Video See-Through. The video see-through (VST) method is perhaps the easier of the two display methods to implement. The key components are a camera, a computer, and an LCD, OLED, or other video display. The camera captures video images, and the computer then replaces or combines some of the pixels in those images with generated graphics pixels. This pixel replacement has advantages such as: easier alignment of view with annotations; complete control over image properties; and allowing the external view to be replaced completely with generated images for greater visibility.
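
The pixel replacement step can be sketched as a per-pixel blend. In the example below, a video frame and a graphics layer are small grids of (R, G, B) tuples, and a mask gives the opacity of the graphics at each pixel: 0 keeps the camera pixel, 1 replaces it outright, and values in between combine the two. The data layout and the function name are assumptions chosen to keep the sketch free of external libraries.

```python
def blend_overlay(frame, graphics, opacity):
    """Combine a video frame with generated graphics, pixel by pixel."""
    blended = []
    for frame_row, graphics_row, opacity_row in zip(frame, graphics, opacity):
        row = []
        for pixel, graphic, alpha in zip(frame_row, graphics_row, opacity_row):
            row.append(tuple(
                round(alpha * g + (1.0 - alpha) * p) for p, g in zip(pixel, graphic)
            ))
        blended.append(row)
    return blended


GRAY = (100, 100, 100)   # camera pixels
RED = (200, 0, 0)        # annotation color

frame = [[GRAY, GRAY, GRAY]]
graphics = [[RED, RED, RED]]
opacity = [[0.0, 0.5, 1.0]]   # untouched, semi-transparent, fully replaced

print(blend_overlay(frame, graphics, opacity))
```

Setting the opacity to 1 everywhere corresponds to the case mentioned above in which the external view is replaced completely with generated images.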

VST  form  factors  can  vary,  but  one  distinction  involving  this 
type  of  system  concerns  whether  the  camera  portion  of  the  system  is  attached  to 
the  user’s  head,  or  else  incorporates  a  remote  camera,  potentially  decoupled  from 
the  physical  pose  of  the  user.  This  latter  case  can  allow  the  system  to  have 
improved  capabilities  over  immersive  -pure”  AR,  since  the  camera  could  be  placed 
in  a  location  with  a  better  field  of  view,  or  even  in  a  position  that  is  more 
advantageous  but  perhaps  more  vulnerable.  It  also  opens  possibilities  for  merging 
AR  with  teleoperation  of  unmanned  systems. 

There are disadvantages to VST systems as well. One is that current portable cameras have resolutions that do not approach that of the human eye, thus reducing range and detail of the external view. Another disadvantage is the time required to process and render each video frame before it can be displayed to the eye. This so-called “glass to glass” delay produces a lag that can result in simulator sickness. Also, if the world is being viewed through a video screen, the screen itself blocks out at least a portion of the real world that would otherwise be visible to the user. This means that a failure of the display system could leave the user at least somewhat blinded until the problem is cleared. While this could perhaps be quickly remedied by moving the display, the negative effect should be minimized if possible.


(2) Optical See-Through. In contrast to VSTs, Optical See-Through
(OST) displays operate on the principle of optically combining the light
coming in from the world with an overlay image generated by a
computer-controlled source. There are several ways of implementing OSTs. Aviation
HUDs have used this approach for decades, but OST head-mounted displays remain
largely experimental and have only recently become available on the market.

Optical  Combiner:  An  Optical  Combiner  is  the  most  basic  of 
the  approaches  to  OST  displays:  a  partially-reflective  transparent  optical  element 
(half-silvered  mirror,  etc.)  is  placed  between  the  eye  and  the  world  at  an  angle.  An 
image  source,  such  as  a  small  LCD  screen,  is  placed  off-axis  to  this  view,  and 
partially  reflects  off  the  combiner  into  the  optical  path  to  the  eye.  In  this  way,  the 
image  on  the  source  appears  to  overlay  the  direct  view  of  objects  in  the  world. 
This  approach  is  fairly  simple,  but  has  several  disadvantages:  the  image  source 
must  be  quite  bright  to  be  seen  in  some  outdoor  circumstances,  the  image  can  be 
washed  out  by  the  external  view,  and  the  field  of  view  can  be  limited.  Some 
advances  have  been  made  in  optical  combiners,  such  as  the  use  of  dichroic 
coatings.  These  coatings  are  reflective  only  to  particular  wavelengths,  and  thus 
can  be  selectively  reflective  to  the  image  generator  wavelengths  while  passing 
much  more  external  light.  Another  issue  with  optical  combiners  is  that  it  is  difficult 
to  synchronize  the  focus  of  the  annotations  with  that  of  the  viewed  objects:  one 
solution for this problem can be to focus the annotations at infinity, so they are
always  in  focus  at  normal  operating  distances,  but  the  implications  of  this 
technique  have  not  been  fully  evaluated. 

Virtual  Retinal  Display:  Virtual  Retinal  Display  (VRD)  is  an 
emerging  technology  initially  developed  at  the  Human  Interface  Technology  Lab  at 
the  University  of  Washington.  In  a  VRD  system,  laser  beams  scan  directly  across 
the  retina  drawing  the  image  directly  in  the  eye  without  an  intermediary  display. 
The  VRD  improves  on  the  optical  combiner  method  because  it  provides  much 
greater  brightness  and  contrast  capabilities. 

Display Masking: One issue common to all optical see-through
systems is that they only provide additive color: they can add color
brightness to the background, but cannot make it darker. This limitation is
perhaps not so important for annotative AR, but for simulative AR it is a problem.
To render realistic images, we need a way to replace the background with our
simulated objects. Just brightening the “pixels” can make realism difficult.
Because of this limitation, successful OST systems may require an addition: a
“mask display” that blocks out the outside view where the augmentation is being
drawn. Mask displays are described in “The End of Hardware” [32], but have
not been fully explored.
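
The additive-color limitation can be illustrated with a small numerical sketch. The two functions below are illustrative assumptions rather than a description of any particular display: luminance is normalized to the range 0 to 1, and the mask value is the fraction of outside light that a hypothetical mask display blocks at a given pixel.

```python
def optical_combine(background, overlay):
    """Plain optical see-through: overlay light is added to world light.

    Values are normalized luminance in [0, 1]; the sum is clipped at 1.0.
    """
    return min(1.0, background + overlay)


def masked_combine(background, overlay, mask):
    """Optical see-through with a mask display (hypothetical model).

    mask is the fraction of world light blocked at this pixel (0 = none,
    1 = all), applied before the overlay light is added.
    """
    return min(1.0, background * (1.0 - mask) + overlay)


bright_sky = 0.9    # luminance of the real background behind the augmentation
dark_object = 0.2   # intended luminance of a dark simulated object

# Without a mask the result can never be darker than the background.
unmasked = optical_combine(bright_sky, dark_object)
# With the background fully masked, the intended luminance is reproduced.
masked = masked_combine(bright_sky, dark_object, mask=1.0)
```

In this sketch the unmasked pixel ends up at least as bright as the sky behind it, while the masked pixel takes on the darker intended value, which is the behavior simulative AR requires.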

(3)  Head-Mounted  Display.  The  stereotypical  view  of  AR  has 
historically involved “Terminator Goggles”: displays mounted over the eyes, like
goggles  or  eyeglasses,  through  which  the  user  views  the  world  and  has 
information  overlaid  onto  that  view  by  the  AR  application.  This  form  factor  is  called 
the  Head-Mounted  Display,  or  HMD.  It  is  a  prevailing  notion  associated  with  AR  in 
the  emerging  public  eye,  but  it  is  not  necessarily  the  best  format  for  all 
applications.

(4) Heads-Up Display. The Heads-Up Display (HUD) is
usually  an  OST  display  hard-mounted  in  front  of  the  user  (usually  in  a  vehicle), 
which  allows  information  to  be  viewed  by  the  user  without  taking  the  eyes  off  the 
external  world.  Although  it  is  the  oldest  and  most  prevalent  AR  display  device,  it  is 
not  usually  recognized  as  such,  because  its  use  has  previously  been  restricted  to 
complex  combat  aircraft.  However,  this  technology  is  mature  and  quite  capable: 
AR  systems  with  HUDs  are  being  developed  for  automobiles. 

(5)  Head-Down  Display.  By  Head-Down  Displays  (HDD),  we 
are  referring  to  displays  that  are  mounted  in  front  of  the  user  (also  most  likely  in  a 
vehicle),  but  not  in  direct  line  of  sight  to  the  external  world.  This  requires  the  user 
to  take  his  eyes  off  the  external  world,  and  precludes  an  OST  configuration. 

(6)  Handhelds.  During  the  year  2009,  mobile  phone  handset 
features  reached  a  level  that  made  initial  commercial  AR  applications  feasible. 
Handheld  devices  with  AR  capacity  typically  have  a  camera  on  the  back  and  a 
display  on  the  front:  the  device  is  pointed  at  the  scene  to  be  viewed. 

3.  Sense/Model  Environment 

A  third  component  that  AR  systems  must  integrate  is  a  way  of  developing  a 
model  of  the  surrounding  environment.  This  is  because,  even  if  the  pose  of  the 
viewer  is  known,  placing  new  annotations  into  the  scene  requires  knowing  the  location 
where  the  annotation  is  being  placed.  Also,  having  a  model  of  the  environment 
facilitates  accurate  depiction  of  occlusion  of  objects  (say,  buildings)  by  other  objects 
that  are  nearer  to  the  viewer  (say,  a  tree  in  front  of  the  building).  AR  is  far  from  the 
only  use  for  accurate  terrain  models,  but  it  is  the  use  most  critical  to  our  project. 
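
The occlusion case above reduces to a depth comparison at the pixel where an annotation is drawn. The following is a minimal sketch, assuming the environment model has already been rendered into a per-pixel depth image from the viewer's pose; the function name, the depth-image layout and the tolerance value are illustrative assumptions.

```python
def annotation_visible(scene_depth, row, col, annotation_depth_m, tolerance_m=0.1):
    """Return True if the annotation is not hidden by a modeled surface.

    scene_depth is a 2D list of distances (metres) from the viewer to the
    nearest modeled surface at each pixel. The annotation is visible when it
    is no farther away than that surface, within a small tolerance.
    """
    return annotation_depth_m <= scene_depth[row][col] + tolerance_m


# Hypothetical 2x2 depth image: a tree 10 m away covers the left column,
# and a building 40 m away fills the right column.
scene_depth = [
    [10.0, 40.0],
    [10.0, 40.0],
]

# An annotation attached to the building face (40 m away):
hidden_by_tree = not annotation_visible(scene_depth, 0, 0, 40.0)
seen_on_building = annotation_visible(scene_depth, 0, 1, 40.0)
```

Where the tree stands in front of the building, the annotation is suppressed; where the building face is directly visible, it is drawn.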

Three-dimensional  geospatial  models  of  active  AOs  are  difficult  to  produce 
with  fidelity  in  real-time,  however.  Simulation  environments  using  actual  geospatial 
data  are  often  of  lower  quality  than  custom-made  imaginary  training  environments  that 
do  not  have  the  requirement  of  replicating  an  actual  particular  locale.  This  lack  of 
fidelity is due largely to an absence of 3D data sensors on the battlefield, and to the
vast,  unwieldy  amounts  of  data  that  such  systems  can  produce. 

Urban  modeling  is  of  interest  to  this  project  because  3D  models  can  be  used 
for  multiple  purposes:  in  our  case,  the  most  pertinent  uses  are  for  model-based 
tracking  and  for  post-mission  third-person  visualization.  There  are  currently  few 
methods  of  comprehensively  capturing  an  urban  landscape  in  detail,  but  much  has 
been done recently to remedy this shortfall [33].

a.  Aerial  LiDAR 

Light  Detection  and  Ranging  (LiDAR)  is  a  method  for  scanning 
objects  in  order  to  determine  their  spatial  features.  This  involves  one  or  many 
individual  laser  rangefinders  scanning  across  the  object  of  interest,  and  sensing 
the  returning  light  to  calculate  distance  to  the  point  of  impact.  This,  when  coupled 
with  a  known  orientation  and  location  of  the  laser,  allows  calculation  of  the  location 
of  the  point  of  impact  of  the  light.  When  a  great  number  of  readings  are  measured, 
they  can  be  composited  into  a  high-detail  spatial  model  of  an  object.  The  largest 
use  of  this  technology  currently  is  the  scanning  of  terrain  from  the  air:  an  aircraft 
carries  the  laser  scanner  and  scans  the  ground  from  altitude,  capturing  enormous 
quantities of 3D points (a “point cloud”). This cloud can then be used in multiple
geospatial  applications:  it  can  be  interpolated  into  a  raster,  which  then  can  be  used 
to  generate  3D  grid  meshes.  Both  these  grid  meshes,  as  well  as  triangulated 
irregular  networks  (TINs),  can  be  created  from  the  scan  data  and  used  for  various 
purposes, including virtual environment terrain. Figure 22 depicts such a mesh.

Figure 22 Model generated from LiDAR scan (from [34])
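
The geometry behind a single LiDAR reading, and the step from a point cloud to a raster, can be sketched briefly. This is a simplified illustration under stated assumptions: a local east-north-up frame, azimuth measured clockwise from north, elevation measured upward from the horizontal, and a raster built by simple binning (keeping the highest point per cell) rather than true interpolation. All function names are hypothetical.

```python
import math


def lidar_return_to_point(sensor_enu, azimuth_deg, elevation_deg, range_m):
    """Convert one range reading into a 3D point (east, north, up).

    sensor_enu is the known laser position; azimuth and elevation are the
    known beam orientation; range_m is the measured distance to impact.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = range_m * math.cos(el)          # ground-plane component
    east = sensor_enu[0] + horizontal * math.sin(az)
    north = sensor_enu[1] + horizontal * math.cos(az)
    up = sensor_enu[2] + range_m * math.sin(el)
    return (east, north, up)


def rasterize_max_height(points, cell_size_m):
    """Bin points into square cells, keeping the highest elevation per cell.

    Returns a dict mapping (column, row) cell indices to elevation. Cells
    with no returns are simply absent; a real pipeline would interpolate.
    """
    grid = {}
    for east, north, up in points:
        cell = (math.floor(east / cell_size_m), math.floor(north / cell_size_m))
        grid[cell] = max(up, grid.get(cell, up))
    return grid


# An aircraft 1000 m above the origin, with the beam pointing straight down:
ground_point = lidar_return_to_point((0.0, 0.0, 1000.0), 0.0, -90.0, 1000.0)
```

The binning step also shows why aerial scans are elevation-focused: each cell keeps one height value, so vertical faces collapse into steps between neighboring cells.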


Advantages 

•  Rapid  data  collection  over  large  areas 

•  Serves  as  a  basis  on  which  to  overlay  additional  data  for 
analysis 

Drawbacks 

•  Elevation-focused:  limited  in  providing  side  detail  on 
buildings 

•  Interpolation  is  necessary  between  points,  causing  vertical 
faces to appear slanted and irregular unless extensive post-processing
is done.

•  Limited  asset:  Airplane  must  be  scheduled 

b.  Manual  3D  Modeling 

3D  models  and  terrain  can  be  generated  by  artists  and  technicians 
using  the  multitude  of  products  available  for  this  purpose.  Some  of  these  products 
include  Blender™  [35],  Google™  SketchUp  [36],  and  Autodesk®  3DS  Max  [37], 
Maya  [38]  and  AutoCAD,  amongst  many  others.  This  is  the  original  method  of 
generating detailed terrain models, and can be the most detailed.

Advantages

•  High  fidelity  is  available;  the  artist  has  almost  complete 
control  over  the  modeling  process,  and  incredible  levels  of 
detail  are  possible 

•  Optimization  measures  are  readily  possible,  such  as  levels 
of  detail  (LOD) 

Drawbacks 

•  There  is  a  large  gap  between  fictional  terrain  and  terrain 
meant  to  correspond  to  reality.  This  can  be  seen  in  various 
game-based  simulations  used  for  training  as  well  as 
entertainment  (e.g.,  VBS2  and  ARMA2,  which  are  based  on 
the  same  engine,  but  for  different  purposes).  Fictional  maps 
can  be  continuously  updated  and  improved  quickly,  but 
terrain  meant  to  duplicate  the  actual  world  has  the  important 
constraint  of  matching  the  actual  locations  and  attributes  of 
real  objects. 

•  Modelers  often  are  not  the  users  of  the  models,  and  do  not 
have  personal  experience  with  the  location  being  modeled, 
which  reduces  accuracy  and  fidelity.  This  process  can  be 
improved  by  using  the  latest  modeling  tools  that  allow 
construction  of  models  using  collections  of  photographs  of 
the  area  or  building  being  modeled,  but  this  depends  on 
source  photos  from  operational  elements  working  in  the 
vicinity. 

C.  EXTANT  AUGMENTED  REALITY  SYSTEMS 

There  currently  are  augmented  reality  systems  in  service  within  the  DoD  and 
academia,  some  of  which  have  been  available  for  30+  years.  The  DoD  systems 
mainly  have  been  in  use  in  the  aviation  domain,  since  aviation  has  both  requirements 
and platform capabilities that are compatible with AR systems in general. However,
there  is  much  research  going  on  in  the  DoD  with  respect  to  manportable  wearable  AR 
systems  for  the  current  and  future  soldier. 

1.  Aviation  Augmented  Reality 

AR-like  technology  has  been  operational  in  the  DoD  for  several  decades. 
While  not  what  most  would  consider  true  augmented  reality,  some  of  these 
technologies  can  provide  capabilities  that  an  AR  system  affords. 

a.  Head-Up  Displays 

Head-Up  Displays  (HUDs)  have  been  in  use  on  military  aircraft  since 
the  1950s:  these  provide  a  see-through  display  system  mounted  above  the 
instrument  panel,  which  provides  key  pieces  of  information  to  the  pilot.  Most  of 
these  information  components  are  not  geo-registered,  but  some  data  (such  as 
weapon  points  of  impact)  are  registered  radially  relative  to  the  aircraft.  This  is  a 
rudimentary  form  of  AR. 

b.  Head-Mounted  Displays 

Head-Mounted  Displays  (HMDs)  take  the  HUD  idea  one  step  further. 
Instead  of  a  display  fixed  to  the  aircraft  in  front  of  the  pilot,  the  HMD  fixes  the 
display  to  the  pilot’s  helmet.  This  helmet  display  is  combined  with  a  head-tracking 
system  in  order  to  provide  a  constantly  updated  view  with  spatially  and  temporally 
relevant  information  to  the  pilot  at  all  times,  regardless  of  direction  of  gaze. 
Additionally,  this  method  can  incorporate  various  types  of  synthetic  vision  aids, 
such  as  thermal  or  electro-optical  sensors,  to  give  the  user  the  capability  to  see  in 
reduced-visibility  environments.  A  groundbreaking  example  of  this  capability  is  the 
AH-64  Apache  Integrated  Helmet  and  Display  Sight  System  (IHADSS)  [39]:  this 
system  combines  a  gimbaled  thermal  imager/visual  camera  mounted  on  the  front 
of the aircraft that can be “slaved” to the position of the pilot’s head, giving the pilot
the perception that he can see in the dark. A newer HMD in use in current U.S.
fighter aircraft is the Joint Helmet-Mounted Cueing System (JHMCS) [40], which
offers  enhanced  synthetic  vision  and  high  angle  employment  of  weapons.  These 
systems  might  be  adapted  to  enable  AR  capabilities. 

2.  Manportable 

a.  Land  Warrior 

Land Warrior [41], seen in Figure 23, is a system to provide the
dismounted  infantryman  the  capabilities  of  FBCB2  integrated  with  a  weapon 
system  and  tactical  communications. 


Figure 23 U.S. Army infantryman with Land Warrior system (From [42])

The Land Warrior system integrates several subsystems:


• A central processing unit running the Linux operating
system

•  A  weapons  subsystem  incorporating  an  M4  carbine  with 
thermal  and  video  camera  sights  and  a  multifunction  laser 
rangefinder 

•  A  helmet  system  with  head-mounted  display  (HMD)  and 
radio  headset.  The  HMD  is  used  to  display  a  tactical  map 
and  communications  interface,  as  well  as  the  sight  picture 
from  the  weapon  sights.  This  allows  the  soldier  to  look  and 
shoot  around  corners  and  obstructions  while  remaining 
under  cover 

•  A  navigation  subsystem  integrating  GPS  and  dead  reckoning 
sensors 

•  A  radio  communications  subsystem  based  on  the  Enhanced 
Position  Location  Reporting  System  (EPLRS) 

Land Warrior was first field-tested in 2000. Due to excessive weight
and cost (40 lbs. and $85,000 per set), the program was cancelled in 2007.
However, Land Warrior was test-fielded to one Stryker infantry battalion deployed
to Iraq and enjoyed some success, particularly after it was improved with soldier
input. Components deemed unnecessary were stripped out, after which the unit found
the system very valuable, particularly for leaders. This success has re-energized
the program, and it is now in service with a complete Stryker Brigade.

Advantages 

•  Tactical  communications  among  infantry  soldiers 

• Vastly improved situational awareness for equipped units

Disadvantages

•  HMD  blocks  the  view  of  the  world  for  one  eye:  users  tend  to 
flip  the  display  out  of  the  way  to  see,  which  impacts 
continuity  of  SA 

•  Battery  power  endurance  is  still  an  issue  for  extended 
operations 

3.  True  Augmented  Reality 

For quite some time, the Department of Defense research community, as
well as academia and industry, has been experimenting with fully
geo-registered AR systems for individual use. Some examples of such systems
follow.


a.  Wearable 

(1)  BARS.  The  Battlefield  Augmented  Reality  System  [43]  is 
a  Naval  Research  Lab  (NRL)  project  to  implement  a  wearable  AR  system  for 
experimentation.  This  has  been  a  widely  published  DoD  research  project,  and  has 
covered  human  as  well  as  technical  factors.  This  system  consists  of  a  wearable 
computer  with  HMD,  and  has  been  evaluated  for  forward  observer  training,  among 
other  topics. 


(2)  MARS.  The  Mobile  Augmented  Reality  System  [44]  is  a 
project  at  Columbia  University  that  also  explores  wearable  AR  capabilities.  It  is 
significant  because  it  has  looked  at  improving  understanding  of  urban 
surroundings  on  the  Columbia  campus. 

(3) Tinmith [45]. The Wearable Computer Lab at the
University  of  South  Australia  has  been  a  longtime  AR  research  organization.  The 
lab’s  most  prominent  project  has  been  the  Tinmith  AR  system.  This  system  is 
similar  to  other  wearable  AR  suites.  It  has  been  used  to  implement  ARQuake  [45], 
an AR version of the Quake first-person-shooter game that demonstrated key
concepts  in  live  training  capability  using  AR. 

These  systems,  and  others  like  them,  have  had  success  in 
trailblazing  AR  for  possible  military  applications.  However,  their  performance  is  not 
production-ready,  because  of  limitations  of  displays,  tracking,  and  power  supply. 

(4) Handheld. The year 2009 was a breakout year for
mobile  AR,  as  several  mobile  phone  platforms  introduced  hardware  features 
necessary  for  AR:  particularly  inertial  sensors  and  compasses.  This  capability 
allowed  the  development  of  a  wide  variety  of  phone  applications  involving  AR, 
using  the  onboard  camera  and  pose  sensors.  This  has  demonstrated  the 
feasibility  of  handheld  AR,  although  accuracy  and  precision  of  pose  are  not  high. 

b.  Tablets 

AR  platforms  based  on  more  powerful  tablet  computers  have 
produced some promising results. Using this platform, the VIDENTE/VESP’R [46],
[47] system from the Technical University of Graz has demonstrated AR exposure
of  subsurface  utilities  in  an  urban  setting.  Starting  with  a  pre-existing  model,  this 
system can be used to “view” pipes, wires and other subsurface structures, in
order to deconflict digging and other utility operations. Such a system is
dependent, however, on an accurate, detailed subsurface map.

V. FINDINGS


Over the course of this research, we have identified several areas to which a
deployed system such as the one we propose can contribute. This thesis also makes
three contributions of its own to the military Augmented Reality body of knowledge.
The first is identifying distinct modes of augmenting reality. The second is outlining
different techniques for providing informational augmentation. The third is beginning
the system design and construction of a usable AR prototype.

A.  “FLAVORS”  OF  AUGMENTED  REALITY 

Upon conclusion of a literature survey, we found that the term “Augmented
Reality” has a broad meaning, and has been used to refer to techniques which share
some commonality (such as the combination of “real world” with augmentations), but
which have different intents. To cope with this, we have named two different
categories of augmentation, which illustrate the different intents for the use of these
two “flavors.”

1.  Simulative  Augmentation  (SimA) 

Simulative augmentations (as seen in Figure 24) have the property of
appearing to be “real” physical objects, and have visual properties commensurate with
the  objects  onto  which  they  are  overlaid.  Consequently,  shadows,  brightness  and 
other  properties  of  the  augmentations  must  be  adjusted  to  levels  appropriate  to  the 
objects  in  view.  In  addition,  the  appearance  of  transparency  must  be  reduced,  so  that 
the  overlain  scene  is  occluded  by  the  augmentation:  if  the  underlying  scene  can  be 
seen significantly through the augmentation, then realism can be compromised.

Figure 24 Example of Simulative AR. The HMMWV on the left is computer-generated.

2.  Annotative  Augmentation  (AA) 

Annotative augmentations are not meant to be taken as “real,” but rather are
used  to  inject  information  into  a  scene,  tying  pieces  of  information  to  the  viewed  real 
world.  Annotative  augmentations  also  must  take  into  account  properties  of  the  scene, 
but  to  a  lesser  extent  than  simulative  augmentations. 

Both  Simulative  and  Annotative  Augmentation  are  true  AR:  in  both,  the 
augmentations  are  spatially  registered,  and  act  as  if  they  truly  exist  at  a  particular 
place “in the world.” (This is in contrast to the overlay of non-registered textual and
other information onto an external view, such as speedometer HUDs that are
available on some automobiles.) The real difference is the intent of the content: SimA
seeks to portray simulated physical objects existing in places where they are not,
while AA seeks to reveal information about objects that already exist. Currently, there
is a great deal of work to be done in human-system integration just in these different
areas of information display and modality. We feel this distinction is helpful because it
illuminates the idea that both “flavors” are AR, while also pointing out a
significant difference between them. Figure 25 illustrates AA.


Figure 25 Example of Annotative AR. (Background image taken by the author in Baghdad, Iraq, in 2006.)

In the case of our proposed system, Annotative AR is the preferred “flavor” of
AR.  Simulative  AR  is  not  generally  applicable,  since  we  are  intending  to  display 
information  that  is  understood  as  non-physical  in  nature.  Our  annotations  must  be 
easily  discernable  from  real  objects  in  the  field  of  view. 

B.  ANNOTATION  TAXONOMY 

In  the  area  of  content  and/or  graphics,  annotation  in  AR  can  be  implemented  in 
an assortment of ways [48]. For this project, we identified four general divisions into
which  annotations  can  be  categorized:  Icons,  3D  Spatial  objects,  Hyperlinks  and 
Regional  Information. 

1.  Icons 

Icons  are  the  basic  form  of  spatially-registered  annotations  that  can  be 
displayed  using  Augmented  Reality.  Icons  take  a  similar  form  to  the  icons  found  on 
computer  desktops:  small  pictures  that  graphically  suggest  the  information  for  which 
they  are  a  link.  AR  icon  annotation  does  not  necessarily  have  to  include  links, 
however.  At  their  simplest  form,  icons  can  be  mere  spatially-registered  dots,  but  their 
informational  content  can  increase  in  parallel  with  their  graphical  sophistication. 

There  are  several  methods  by  which  we  can  modify  icons  in  an  AR  application, 
in order to convey information [48]. For analyzing and illustrating their application to
this domain, we will use attacks on friendly forces as an example case.

a.  Color 

Annotations  can  be  color-coded  in  different  ways  to  convey 
information.  One  simple  example  is  the  convention  that  pieces  of  friendly 
information  are  colored  blue,  enemy  red,  neutral  green,  and  so  on.  Color  can 
signify  categories  or  quantities:  in  our  application,  for  instance,  confirmed  IED 
icons  could  be  colored  red,  while  suspected  IED  locations  could  be  orange,  and  so 
on. Alternatively, one could color IED attacks based on what specific type of IED
was  used  (known  or  not),  or  how  many  people  were  casualties  of  that  specific 
attack. 


(1)  Spatial  Variation.  One  way  we  can  convey  extra 
information  is  to  vary  the  color  of  an  annotation  across  its  expanse.  This  can  be 
implemented  using  discrete  variation  (coloring  different  primitive  features  of  the 
icon in different colors), or else in a gradient-type continuous variation. Spatial
variation must be applied with care, however, because as distance to the icon’s
location increases, the icon tends to become smaller, and the chance of losing
the detail that spatial variation requires becomes greater. This issue can be
mitigated by not scaling icons purely by distance, but with a variable scaling
function (e.g., scale = ln(distance)).
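
The effect of a logarithmic scaling function can be sketched numerically. The example below is one possible reading of the scaling idea, not the prototype's implementation: it assumes a simple pinhole projection, treats the logarithm as a multiplier on the icon's world-space size, and clamps that multiplier at 1 so that nearby icons are never shrunk. The focal length and function name are hypothetical.

```python
import math


def screen_size_px(base_size_m, distance_m, focal_px=800.0, log_scaled=True):
    """Apparent icon size in pixels under a pinhole camera model.

    With log_scaled=False the icon has a fixed world size and shrinks as
    1/distance. With log_scaled=True its world size is multiplied by
    max(1, ln(distance)), so it shrinks more slowly at long range.
    distance_m must be greater than zero.
    """
    scale = max(1.0, math.log(distance_m)) if log_scaled else 1.0
    return focal_px * base_size_m * scale / distance_m


# Compare both policies for a 1 m icon at increasing distances.
for d in (2.0, 50.0, 100.0, 500.0):
    fixed = screen_size_px(1.0, d, log_scaled=False)
    logged = screen_size_px(1.0, d, log_scaled=True)
    print(f"{d:6.0f} m  fixed: {fixed:7.2f} px  log-scaled: {logged:7.2f} px")
```

Under these assumptions the log-scaled icon still shrinks with distance, preserving a depth cue, but it retains several times more pixels at long range for color gradients or other internal detail.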

(2)  Temporal  Variation.  Another  way  we  can  convey 
information  is  by  varying  the  color  of  an  annotation  over  time.  This  can  include  an 
alpha channel as well, so transparency is an option. Icons can be made to “blink”
by  rapidly  varying  either  their  transparency  or  their  color.  This  can  recreate  the 
effect  of  red  warning  lights  on  radio  transmission  towers,  for  example. 

A way we could apply this to our IED example is to cause
IED icons to blink or change color at a rate related to their suggested “severity”:
faster blinking could indicate a higher predicted danger level. One caveat is that red
blinking  items  should  probably  not  be  used  to  indicate  information  other  than  that 
which  is  dangerous,  life-threatening,  or  of  some  other  emergency  nature. 
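
A severity-driven blink can be expressed as a periodic function of time. The sketch below is illustrative only: severity is assumed to be normalized to the range 0 to 1, and the minimum and maximum blink rates are placeholder values, not recommendations drawn from human-factors testing.

```python
import math


def blink_rate_hz(severity, min_hz=0.5, max_hz=3.0):
    """Map a normalized severity (0..1) linearly onto a blink frequency."""
    severity = min(1.0, max(0.0, severity))   # clamp out-of-range inputs
    return min_hz + severity * (max_hz - min_hz)


def blink_alpha(time_s, severity):
    """Icon opacity (0 = transparent, 1 = opaque) at a given time.

    Opacity follows a cosine wave whose frequency rises with severity,
    starting fully opaque at time zero.
    """
    freq = blink_rate_hz(severity)
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * freq * time_s))
```

A renderer would call `blink_alpha` once per frame with the elapsed time and apply the result to the icon's alpha channel. The same function could drive a color interpolation instead of transparency.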

b.  Shape 

Shapes  of  annotations  can  indicate  a  great  deal  of  information.  The 
basic  shape  of  an  icon  can  communicate  the  most  fundamental  details  about  its 
content.  Militaries  around  the  world  utilize  this  fact,  as  evidenced  by  the 
abundance of available tactical graphics in their doctrine: just by glancing at a unit
icon,  we  can  see  easily  what  size  and  type  of  unit  it  represents. 

More  complex  methods  of  graphical  variation  have  the  potential  to 
convey  a  great  deal  more  information.  An  interesting  foray  into  this  notion  is  the 
1973  paper  by  Chernoff  [49]  on  the  technique  of  displaying  faces  to  represent 
multidimensional  data:  this  takes  advantage  of  the  fact  that  the  human  brain  is 
structured  to  recognize  slight  variations  in  facial  expressions.  In  this  example, 
faces  are  drawn  with  dimensions  that  are  linked  to  quantities  (e.g.,  length  of 
mouth, spacing of eyes, slant of eyebrows). Trends can be seen easily, for
instance, if most of the faces in a given data set are “happy.” The use of facial
features  or  similar  constructs  as  icons  is  worth  further  exploration. 

A good application of shape variation is in indicating important
quantities. For example, we can vary the size of an IED attack icon to
indicate the number of casualties.

The  size  and  shape  of  an  annotation  also  can  be  varied  over  time. 
Thus,  instead  of  blinking,  we  can  make  icons  swell  and  shrink  periodically. 
Changes  in  size  and  shape  of  the  icon  can  similarly  be  tied  to  a  quantity. 

c.  Textual  Content 

Another  way  we  can  convey  information  in  an  annotation  icon  is  to 
display  various  textual  elements  as  part  of  the  icon  itself.  Displaying  textual 
elements  is  not  the  same  as  displaying  quantities  of  written  text  in  a  spatial 
manner. Instead, it means that anything from single characters up to abbreviations
and short words can be incorporated into the icon itself. In our example, the icon could
incorporate a single letter to indicate the type of attack: “I” for IED, “S” for sniper,
“M” for mortar, and so on.

d. Time

Incorporating  temporal  duration  as  another  piece  of  annotatable 
information can be useful. For instance, we can vary the color of an icon over
time to indicate that one attack is very recent and another is old. We could “fade”
icons as time passes. These fading icons would allow the viewer to
distinguish  the  age  of  particular  attacks. 
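
One simple way to realize such fading is an exponential decay of opacity with the age of the event. This is a sketch under assumed parameters: the 30-day half-life and the minimum opacity floor (which keeps very old events faintly visible) are illustrative choices, not values from this research.

```python
def fade_alpha(age_days, half_life_days=30.0, floor=0.15):
    """Icon opacity as a function of event age.

    Opacity halves every half_life_days and never drops below floor, so
    old attacks remain faintly visible rather than disappearing.
    """
    alpha = 0.5 ** (age_days / half_life_days)
    return max(floor, alpha)


recent = fade_alpha(0.0)      # an attack reported today is fully opaque
month_old = fade_alpha(30.0)  # one half-life later it is half as opaque
```

Whether old events should bottom out at a floor or vanish entirely is a design decision that depends on how much historical clutter the viewer can tolerate.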


Figure  26  Example  of  icon  depicting  an  IED  (the  orange  star  on  the  left) 

2.  3D  Spatial  Objects 

For some types of information, it is useful to display spatial extent, rather than
just point icons. In this case, we can increase the dimensions of the annotations, in
order to create lines, regions, or even volumes. Aside from the increased dimensions,
this annotation type is very similar to an icon. Figure 26 and Figure 27 illustrate some
3D annotations.


Figure  27  Example  of  a  3D  spatial  object  (the  green  line  depicting  Route  ABC123) 

3.  Registered  Hyperlinks 

A more complex method of conveying information with AR is to use icons as
“spatial hyperlinks.” In this case, while the icon itself can convey information, it can
also serve as a pointer to call up more detailed information on a particular location in
space. An example of this today is the popup “bubble” found in Google Earth: an icon
is used to convey a small amount of information but then expands when clicked to
display textual content (as well as other media options).

This method of employing spatial hyperlinks allows information to be kept at a
useable  level.  If  we  were  to  affix  text  labels  onto  points  in  order  to  describe  them,  the 
field  of  view  could  quickly  become  saturated. 

Combining  icons,  links  and  editable  text  can  potentially  provide  a  rich  interface 
for  organizing,  editing  and  presenting  relevant  tactical  information.  One  way  that 
combined icons can be implemented is as a “3D wiki.” Wikis are Web sites that are
directly  editable  by  users  from  the  browser,  and  are  organized  using  categories  and 
tags,  so  that  they  are  structured  in  a  net,  and  not  a  tree  hierarchy.  This  net  structure 
allows  easy  insertion  of  new  information.  Our  proposed  use  case  for  adding  to  a  3D 
wiki  is  as  follows. 

• User “clicks” an “add icon” button, putting the system in
“add” mode

•  User  indicates  the  object  to  be  annotated,  retrieving  the  3D 
coordinates  of  the  point  to  place  the  icon 

• The icon is “double-clicked,” and a small Web browser pane
pops up with a wiki in “add page” mode

•  Information  on  the  object  is  entered  (either  on  the  spot,  or  at 
a  later  time). 

•  The  page  is  closed,  and  the  icon  saved:  the  link  to  the 
created  wiki  page  is  saved  as  part  of  the  icon,  but  this  data  is 
kept  separate  from  the  wiki  database. 
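
The use case above implies a data model in which each icon stores only a link to its wiki page, while the page content lives in a separate store. The following sketch illustrates that separation. All class, field and method names are hypothetical, and a plain dictionary stands in for the wiki database.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationIcon:
    position: tuple       # 3D coordinates of the annotated point
    symbol: str           # which icon graphic to draw
    wiki_page_id: str     # link to the wiki page; no page text is stored here


@dataclass
class AnnotationStore:
    icons: list = field(default_factory=list)        # icon data
    wiki_pages: dict = field(default_factory=dict)   # stand-in for the wiki

    def add_icon(self, position, symbol, page_text=""):
        """Create a wiki page and an icon that links to it."""
        page_id = f"page-{len(self.wiki_pages) + 1}"
        self.wiki_pages[page_id] = page_text   # may be left empty for now
        icon = AnnotationIcon(position, symbol, page_id)
        self.icons.append(icon)
        return icon

    def edit_page(self, icon, text):
        """Fill in or update the page behind an icon at a later time."""
        self.wiki_pages[icon.wiki_page_id] = text


store = AnnotationStore()
icon = store.add_icon((431.2, 88.0, 12.5), "IED")    # placed on the spot
store.edit_page(icon, "Details entered after the patrol.")
```

Keeping the link on the icon and the text in the wiki means either side can be browsed or edited through its own interface, in the AR view or in a conventional browser.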

Implementing  this  method  allows  multimode  interaction  with  all  pieces  of  data: 
the  AR  annotation  and  the  associated  wiki  page.  This  technique  is  implemented  in  a 
similar  manner  in  geobrowsers,  such  as  Google  Earth. 
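
As a minimal sketch of the use case above, an icon could be modeled as a record that holds its 3D position and only a link to its wiki page, with the page content kept in the wiki itself. The class and field names (`Icon`, `IconStore`, `wiki_url`) and the example URL are assumptions made for illustration; they are not part of the prototype.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Icon:
    """A spatially registered icon (hypothetical structure)."""
    icon_id: int
    position: Tuple[float, float, float]  # 3D coordinates of the annotated point
    wiki_url: Optional[str] = None        # link only; page content lives in the wiki


@dataclass
class IconStore:
    """Holds icon data separately from the wiki database."""
    icons: Dict[int, Icon] = field(default_factory=dict)
    _next_id: int = 1

    def add_icon(self, position: Tuple[float, float, float]) -> Icon:
        # "Add" mode: the user has indicated an object and its 3D point is known.
        icon = Icon(icon_id=self._next_id, position=position)
        self.icons[icon.icon_id] = icon
        self._next_id += 1
        return icon

    def attach_page(self, icon_id: int, wiki_url: str) -> None:
        # On closing the page, only the link is saved as part of the icon.
        self.icons[icon_id].wiki_url = wiki_url


store = IconStore()
icon = store.add_icon((12.0, -3.5, 1.2))
store.attach_page(icon.icon_id, "http://wiki.example/pages/checkpoint-7")
```

Keeping only the link in the icon record mirrors the separation described above: the annotation and the wiki page can each be edited without touching the other's storage.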

4.  Regional  Information 

In some cases, there is information that is relevant to the user that is
particular to user location, but on a much larger scale than would be manageable
using icons. An example of such information might be the existence of a local
curfew in a particular neighborhood. This information is spatially associated with
that particular neighborhood, but placing one icon in the middle of the
neighborhood to indicate this would be a misrepresentation. We could also place
icons throughout the neighborhood, but that would clutter the visual screen real
estate. A recommended way to display this type of information would be to
indicate the area under curfew with a polygon drawn on its boundaries in a
geobrowser, and have a “state board” in the corner of the viewer’s display that
could display text stating “curfew here, 2000-0600 Fridays,” or something similar.
This message would only appear for the AR user if he were located inside the
boundary polygon. In this way, the information displayed is spatially filtered.
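
The spatial filter described above amounts to a point-in-polygon test. The following is a minimal sketch using the standard ray-casting method; it assumes the user position and region boundaries are already expressed in a common planar coordinate system, and the function names and sample coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count crossings of a ray cast in +x from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def regional_messages(position: Point, regions) -> List[str]:
    """Return the messages of every region whose polygon contains the position."""
    return [msg for polygon, msg in regions if point_in_polygon(position, polygon)]


curfew_zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
regions = [(curfew_zone, "Curfew here, 2000-0600 Fridays")]
inside_msgs = regional_messages((1.0, 1.0), regions)
outside_msgs = regional_messages((5.0, 1.0), regions)
```

Only `inside_msgs` carries the curfew text, so the board stays empty for a viewer outside the boundary.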

Note  that  this  category  of  augmentation  sits  on  the  border  between  “AR” 
and “non-AR”: the information displayed is dependent on the position of the
viewer,  but  the  annotations  are  not  themselves  spatially  registered  in  the  view. 

C.  CATEGORIZATION  OF  TACTICAL  ANNOTATIONS 

Over  the  course  of  this  research,  we  examined  various  potential  annotations  to 
be  displayed  by  the  system:  our  analysis  incorporated  relevant  military  operational 
experience,  and  developed  some  examples  of  information  that  could  be  portrayed 
through  annotations.  We  analyzed  these  examples  by  the  following  metrics  with  the 
goal  of  evaluating  the  elements  of  information  by  technical  feasibility  and  value  to  the 
soldier: 


1.  Useful  Elements  of  Information/Knowledge 

There  are  innumerable  pieces  of  information  that  are  useful  to  a  leader  in 
combat.  Using  subjective  judgment  based  on  operational  experience,  we  identified 
some  key  elements  that  would  be  essential  to  any  AR-based  tactical  knowledge  /  SA 
system.  We  then  analyzed  their  necessary  qualities,  based  on  the  attributes 
described in the previous sections, and represented this information in Table 4.
These pieces of information include:

Locations of Friendly Forces: Location of friendly units is of high importance
to leaders maneuvering in combat situations, due both to the importance of
maintaining an accurate tactical picture of the battlefield and to the need to
avoid incidents of friendly fire.

Location  of  Enemy  Forces:  The  location  and  disposition  of  enemy  elements 
in  the  local  area  is  something  that  every  combat  leader  wants  to  know.  This  is 
complicated,  however,  by  the  fact  that  the  enemy  is  generally  noncompliant  with  our 
attempts  to  track  him. 

IED Locations (Current / Suspected; Historic): The currently suspected
location of Improvised Explosive Devices (IEDs) is of course of great concern to the
soldier in combat. However, the historic locations of exploded or found IEDs can be
extremely important as well, because attacks tend to occur in places that are
conducive to such attacks. This can be a cue to the patrol leader that the IED threat
might be elevated when approaching certain locations.

Table 4  Annotation analysis

                                            | Annotation Content & Metrics
Elements of Information/Knowledge           | Annotation Technique | Update     | Duration | Variables                                               | Criticality | Precision | Import. (C+P) | Data Source            | Description
--------------------------------------------+----------------------+------------+----------+---------------------------------------------------------+-------------+-----------+---------------+------------------------+----------------
Locations of Blue Forces                    | Icon                 | Heartbeat  | Minutes  | Type, Est. Precision, Velocity Vector, Callsign/Channel | 4           | 3         | 7             | Blue Force Pos Updates | 3D Unit Symbols
Location of Enemy Forces                    | Icon                 | On Command | Minutes  | Type, Est. Precision, Velocity Vector                   | 4           | 2         | 6             | Spot Reports           | 3D Unit Symbols
IED Locations (Current/Suspected)           | Icon                 | On Command | Hours    | Type, Size                                              | 4           | 4         | 8             | Spot Reports           | IED Icons
IED Locations (Historic)                    |                      | Static     | Static   | Type, Size, DTG                                         | 3           | 4         | 7             | Tactical Database      |
Sniper Attack Positions (Current/Suspected) |

Enemy Attack Positions (Current/Suspected; Historic): Displaying
positions from which the enemy may attack or has already attacked is very important,
because, again, they are good places to start when searching for threats.

Enemy Engagement Zones (Current/Suspected; Historic): As with enemy attack
positions, areas that are more dangerous or vulnerable are worth identifying
so that they can be avoided if possible.

Routes: Much use is made in combat (urban, especially) of naming routes
through areas. These can be closed at times, depending on the tactical situation; can be
blocked; can have trafficability properties that make them more or less desirable for
travel; can have lesser or greater rates of friendly or civilian traffic; and can have many
other properties that could be displayed to the user to help make tactical maneuver
decisions.

Person  of  Interest  (Current  Target;  Historic):  Locating  persons  of  interest 
(whether  targets  or  allies)  plays  a  big  role  in  counterinsurgency  combat.  Displaying 
the  locations  of  particular  individuals  and  having  the  ability  to  link  to  historical 
information  on  them  can  assist  the  tactical  leader  in  many  types  of  activities. 

Subsurface (Culverts, Sewer, Utilities), Bridges: Subsurface infrastructure
is relevant both as a potential IED emplacement location and for its role in
understanding the state of essential services in an area. This is true as well for
bridges, which have other properties such as weight/load class and state of repair.

Host-Nation Facilities: Locations of local civil institutions and facilities are
a perhaps mundane but essential class of information that is used on a daily basis in
a COIN campaign.

Cleared CASEVAC Helicopter Landing Zones: Pre-identified locations of
helicopter landing zones or areas are a critical piece of information for anyone fighting
a COIN operation, especially in urban areas, because helicopter casualty evacuation
(CASEVAC) is the quickest and best way of getting wounded soldiers to medical
treatment. Not all places in an urban setting are suitable for landing helicopters, and
hazards can be pre-surveyed and displayed for ground forces, and perhaps for the
helicopter as well.

Local  Cultural  Events:  Local  events  of  daily  life  among  the  populace  are 
always  good  to  know,  and  can  be  confined  to  certain  locations.  Knowing,  for  instance, 
that  there  is  a  daily  curfew  in  effect,  or  that  it  is  market  day  in  a  particular 
neighborhood  is  useful  and  can  be  decisive. 

Blue  Force  Events:  Friendly  force  events  occurring  in  the  local  area  are  also 
important  to  know,  and  sometimes  quite  difficult  to  gather.  An  example  could  be  a  unit 
conducting  an  operation  in  a  particular  area  that  happens  to  be  along  a  heavily 
trafficked  route:  units  using  that  route  could  understand  more  of  the  tactical  situation, 
and fratricide could be more easily prevented.

“Guidons” Announcements: “Guidons” calls are quick announcements
(traditionally over a voice radio network) to notify all units in an organization of a
particularly important piece of information that is time sensitive. If a unit passing
nearby or through another unit’s area of operations can automatically receive such
notifications based on its spatial location, the information can be disseminated
farther than just the land-owner unit’s organic components.

2.  Analysis  Factors 

Annotation  Technique:  This  refers  to  the  suggested  type  of  annotation  most 
suitable to portray the particular element of information. In our case, the options are
Icon, 3D Spatial, Hyperlink or Regional Information.

Update: This refers to the method by which the annotation is introduced or
updated in the system. This can be periodic, such as the periodic “heartbeat,” which
continually updates friendly positions on the tactical network; on command, as soon
as the information is known; static and unchanging, which applies most to historic
events or locations; or per a predefined schedule.

Duration: Duration refers to the length of time that a piece of information is
most likely to be of use. Items involved in current operations will most likely expire in
usefulness at some time, at which point these same items would then pass into the
“historic” category, and exist statically.
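
The relationship between Duration and the “historic” category can be sketched as a simple expiry rule. This is an illustration only: the `Annotation` class, its field names, and the two-hour lifetime are assumptions, not values taken from the prototype.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical annotation record with a useful lifetime."""
    label: str
    created_s: float   # time the annotation was introduced, in seconds
    duration_s: float  # assumed span over which it counts as current

    def category(self, now_s: float) -> str:
        # Once the useful lifetime has elapsed, the item becomes historic.
        if now_s - self.created_s < self.duration_s:
            return "current"
        return "historic"


ied = Annotation("Suspected IED", created_s=0.0, duration_s=2 * 3600.0)
early = ied.category(now_s=3600.0)
late = ied.category(now_s=3 * 3600.0)
```

Under this rule nothing is deleted when it expires; the same record simply changes category, which matches the idea of items passing into the historic set and existing statically.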

Variables: Within the mentioned general elements of information there are
sub-pieces of information that are useful to portray or store. Developing annotations
that can display as many of these as are useful, without overwhelming the user, can add
more value to the tactical system.

Criticality:  A  subjective  rating  of  the  importance  of  the  information  being 
annotated,  on  a  scale  of  1-4,  with  4  indicating  the  most  critical. 

Precision:  The  relative  importance  of  high  precision  in  spatial  placement  of  the 
annotations.  Rated  on  a  scale  of  1-4,  with  4  requiring  the  most  precision. 

Import.  (C+P):  The  sum  of  criticality  and  precision,  meant  to  indicate  the 
amount  of  benefit  provided  if  an  AR  augmentation  were  used  to  display  the 
information,  vs.  current  methods. 
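
As a small worked example of the Import. (C+P) metric, the sketch below sums criticality and precision for the first rows of Table 4 and ranks the elements. The class name and validation are assumptions added for illustration; the ratings themselves are the ones listed in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRating:
    element: str
    criticality: int  # 1-4, with 4 the most critical
    precision: int    # 1-4, with 4 requiring the most precision

    def __post_init__(self):
        for name in ("criticality", "precision"):
            value = getattr(self, name)
            if not 1 <= value <= 4:
                raise ValueError(f"{name} must be in 1..4, got {value}")

    @property
    def importance(self) -> int:
        # Import. (C+P): the sum of the two subjective ratings
        return self.criticality + self.precision


ratings = [
    AnnotationRating("Locations of Blue Forces", 4, 3),
    AnnotationRating("Location of Enemy Forces", 4, 2),
    AnnotationRating("IED Locations (Current/Suspected)", 4, 4),
    AnnotationRating("IED Locations (Historic)", 3, 4),
]
ranked = sorted(ratings, key=lambda r: r.importance, reverse=True)
```

Because the sum ranges from 2 to 8, ties are common; here two elements share a score of 7, so the metric is best read as a coarse ordering rather than a strict ranking.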

Data  Source:  Presumably,  in  such  a  system  as  we  are  discussing,  data  will 
enter  the  system  in  different  ways.  Because  of  the  potential  for  huge  quantities  of 
information  being  sent  across  tactical  networks,  it  may  be  desirable  to  have  a  local 
replica  of  an  appropriate  tactical  database  on  a  high-capacity  storage  device  located 
on  the  system,  which  can  be  synched  and  updated  prior  to  mission  start,  in  order  to 
confine  network  traffic  to  only  new  or  changed  data.  For  this  dynamic  data,  something 
similar  to  the  current  spot  report  system  on  the  mobile  tactical  network  would  suffice. 
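
The idea of confining network traffic to new or changed data can be sketched as a comparison between the local replica and the remote database. The per-record version number, the record identifiers, and the payload strings below are all assumptions made for this example; the thesis does not specify a synchronization scheme.

```python
def changed_records(local, remote):
    """Return remote records that are new or newer than the local replica.

    Both arguments map record id -> (version, payload).
    """
    return {
        rec_id: rec
        for rec_id, rec in remote.items()
        if rec_id not in local or rec[0] > local[rec_id][0]
    }


# Replica synchronized before mission start
local = {
    "ied-17": (3, "historic IED location"),
    "lz-2": (1, "cleared landing zone"),
}
# State of the tactical database during the mission
remote = {
    "ied-17": (3, "historic IED location"),
    "lz-2": (2, "cleared landing zone, wires on north side"),
    "spot-91": (1, "suspected sniper position"),
}

delta = changed_records(local, remote)  # only this needs to cross the network
local.update(delta)
```

In this example one unchanged record is skipped, so two of three records are transferred; with a large static database the saving would be proportionally greater.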

Description: A description of what a possible instance of the information
element could be like.

Figure 28  Depiction of a (goal) view through our system, with Annotative
Augmentations displaying current and historical tactical information


Example: In Figure 28, many of the elements of information described in
Table 4 are depicted: icons depict a sniper position, an operational
objective, and an intended breach point; a 3D Spatial annotation marks “Route Chargers”;
the “OBJ Raiders” icon has a hyperlink button which will open a Web page with
information on the objective; and in the lower right corner, regional information shows
current goings-on in the local area. In addition, enhancements are provided to assist
in interpretation of the scene, such as an overhead map, text warnings, and arrows to
highlight locations of threats.

D.  SYSTEM ANALYSIS AND DESIGN


Having analyzed the desirable requirements for the annotations,
we then analyzed the capabilities and requirements of hardware and software in order to
provide an initial concept for a prototype system that would be capable of producing
the intended annotations and views.

1.  Vehicle  Platform 


Figure  29  PARPICE-V 

The foundation of the PARPICE system is the vehicle platform, or PARPICE-V,
which can be seen in Figure 29 and Figure 30. In our case, this platform is a 2005
Toyota Tacoma quad-cab. Our particular vehicle came outfitted with two storage
batteries, a power conversion system, and a telescoping mast with associated air
compressors, which provide a robust infrastructure on which to build our system.

Figure 30  The PARPICE vehicle (rendered). Note the LiDAR and panoramic camera
atop the telescoping mast.

2.  Desired  Functionality 

As addressed previously, it is our intent eventually to address two operational
problems: the “Knowledge Persistence Problem,” and the “Immediate Tactical
(Situational) Awareness Problem.” Therefore, we have identified capabilities that
would each address both of our identified problems.

We captured the system functions/capabilities and their relations in the
diagram in Figure 32. This diagram lays out the identified
functional capabilities, as well as the inputs and outputs of each function. Additionally,
the physical or software components that implement each functionality group are
specified across the bottom of the diagram, similar to the IDEF0 system diagram
standard.

In addition to developing the desired functionality, we must simultaneously
minimize the negative impact of the use of such a system. This leads us to identify
several constraints we must mitigate:

•  This  system  is  intended  to  add  capabilities  to  the  soldier,  not 
replace  any  current  capability.  Because  of  this,  we  determined  that 
the  system  should  not  impede  the  user  in  any  significant  way.  This 
caveat  means,  in  particular,  that  complete  system  malfunction  will 
not  impede  mission  accomplishment  by  the  user. 

•  This  system  is  intended  to  be  used  in  a  moving  vehicle.  Because 
of  this,  the  components  must  be  robust  enough  to  handle  the 
impact  of  vehicle  vibration  and  motion. 

•  A  minimal  signature  is  preferred:  this  system  should  not  add 
considerably  to  the  vehicle’s  detectability. 


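Figure 31  IDEF0 functional model (a box labeled with the function name and function
number, with Input entering from the left, Control from above, Mechanism from below,
and Output leaving to the right)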


In light of these desired capabilities and identified constraints, the following
sections outline our functional grouping, and discuss the work accomplished within
each function, as well as the current status and issues experienced. Figure 32 gives an
IDEF0 overview of the functional architecture we have developed.

Figure 32  Functional breakdown diagram

a.  Sense Panoramic View

This function embodies the capability to provide the inputs to
Panoramic Indirect Vision, which we define as the ability for the user to view the
external world with an exceptionally wide field of view while under armor
protection. This requires the hardware and software to sense panoramic images:
to sense light from the surrounding environment in a 360-degree arc around a point,
convert this light into pixels, and then composite these pixels together into a
panoramic image frame.


Figure  33  Ladybug®  2  spherical  camera 

To  provide  the  panoramic  sensor  function,  we  used  a  Point  Grey 
Research  Ladybug®  2  [50]  spherical  camera,  seen  in  Figure  33.  This  camera 
consists  of  six  individual  CCD  sensors  and  lenses  mounted  in  one  enclosure:  five 
organized  in  a  band  around  the  body,  and  one  pointed  directly  up.  The  cameras 
stream images to the Ladybug software via a compressor unit and an IEEE 1394b
bus, where the individual camera images are stitched together in the
graphics card to produce one large panoramic image stream (see Figure 34),
which can then be saved as video files or still pictures. Table 5 outlines the Ladybug
2’s primary output parameters.


Figure  34  Ladybug  panoramic  view 


The  Ladybug  2  camera  serves  two  purposes  in  the  PARPICE 
system:  the  first  is  to  provide  the  user  a  panoramic  view  of  the  external  world  as 
part  of  the  video  see-through  AR  system;  the  second  is  to  record  video  as  the 
vehicle  moves,  in  order  to  be  used  later  in  texturing  an  urban  model  built  from 
LiDAR  data. 


Table 5  Ladybug 2 properties

Individual Camera Resolution        | 1024 x 768 pixels
Largest Stitched Image Resolution   | 5400 x 2700
Refresh Rate                        | 10-30 Hz (hardware dependent)

The Ladybug 2 camera was successfully installed on the PARPICE-V.

Video capture was successfully conducted on the NPS campus, in
synchronization with LiDAR capture, and was saved.

Live video was successfully streamed to the user touch screen while
driving.


Our  software  using  the  Ladybug  Software  Development  Kit  was  still 
limited  in  capability,  particularly  in  the  process  for  stitching  the  six  camera  feeds 
into  a  single  frame  buffer,  and  then  transferring  that  frame  buffer  to  a  texture  on 
the  OpenSceneGraph  sphere  in  the  Vizard™  environment.  However,  stitched 
video  was  recorded  and  then  played  back  satisfactorily  in  our  test  system. 

b.  Determine  Pose 

Pose of the camera and LiDAR points of view is the basic “origin”
data to which all other relevant data is spatially registered. This function provides
the ability to determine the pose of the point of view: that is, the
3D position (latitude, longitude and altitude) and 3D orientation (heading, pitch and
roll) of a first-person point of view.

This  functionality  is  in  a  very  limited  state  at  this  time.  A  delay  in 
hardware  availability  prevented  us  from  implementing  true  pose  determination. 
For  the  purpose  of  working  with  stored  data  in  the  lab,  pose  determination  was 
conducted  post-hoc  using  GIS  systems  and  situational  knowledge  of  the  test  run 
locations.  A  partial  prototype  system  was  constructed  using  a  Webcam  and  an 
Intersense  InertiaCube:  using  a  manually  constructed  3D  model  of  an  outdoor  site, 
and  a  tripod  that  mounted  the  camera  and  sensor,  very  promising  demonstrations 
of AR capabilities were conducted, including placing and interacting with icons.

c.  Interface With User

This  function  involves  controlling  the  field  of  regard  of  the  camera 
(i.e.,  pan  and  tilt)  as  well  as  manual  selection  and  alteration  of  annotation  data.  In 
order  to  do  this,  the  user  must  be  able  to  select  a  portion  of  the  surroundings  to 
view,  from  the  entirety  of  a  panoramic  view.  Additionally,  the  user  must  be  able  to 
select  and  edit  annotations;  must  be  able  to  designate  one  or  more  of  the 
aforementioned  annotations  as  the  item  of  interest,  and  query  any  information  with 
which  it  is  associated  and,  then,  manipulate  that  information  if  desired  (to  include 
the  spatial  location  of  that  annotation). 

The  user  interface  function  was  developed  as  a  combination  of  two 
subfunctions:  tracking  the  user’s  head  in  order  to  control  the  field  of  view  on  the 
screen,  and  accepting  mouse  events  from  either  a  standard  USB  mouse  or  a 
touchscreen  overlay. 


(1)  Head  Tracking.  Since  we  utilized  a  spherical  camera,  we 
found  it  necessary  to  have  a  mechanism  to  control  the  view  being  displayed. 
Normally,  this  would  be  done  using  a  mouse,  a  touchpad,  or  a  joystick,  but  these 
are  not  convenient  for  use  in  a  moving  vehicle  (a  joystick  would  be  the  best  of 
those,  but  that  takes  away  the  use  of  one  hand).  A  novel  alternate  method  was 
found,  however,  through  the  use  of  a  head  tracker.  We  utilized  a  TrackIR™  4 
infrared  tracker  camera  [51],  in  conjunction  with  reflective  head  markers  to  control 
the view with the head alone.

Figure 35  TrackIR™ camera and hat with reflective markers

The TrackIR device from NaturalPoint uses computer
vision techniques to track the reflectors mounted on the user’s head. Since the
dimensions  between  the  markers  are  known  beforehand,  the  TrackIR  software 
calculates  their  positions  and  moves  the  view  accordingly.  With  three  markers,  the 
system  can  track  a  head  in  full  6DOF.  TrackIR  originally  was  intended  for  use  with 
first-person  shooter  video  games.  To  allow  the  player  to  look  around  without 
changing  his  direction  of  movement,  the  rotation  rates  of  the  view  are  amplified: 
the  user  tilts  his  head  a  small  amount  to  the  left,  and  can  move  the  view  in  the 
game  around  to  the  left,  up  to  directly  behind  him.  This  allows  the  user  to  look 
around  the  game  world  while  physically  continuing  to  gaze  at  the  computer  monitor 
in front of him.

Figure 36  TrackIR 5 software (depiction of head orientation)


In  our  implementation,  we  are  only  interested  in  the  pitch  and 
yaw  of  the  user’s  head.  Translation  and  roll  measurements  are  not  utilized 
currently,  because  we  are  viewing  the  world  as  seen  from  the  Ladybug  camera’s 
perspective,  and  this  does  not  translate  or  rotate  with  respect  to  the  vehicle. 
Because  of  this,  we  chose  to  turn  off  translation  and  roll  tracking  in  the  TrackIR 
software.  This  had  the  great  benefit  of  making  the  system  robust  to  the  motion  of 
the  user  due  to  the  motion  of  the  vehicle:  even  if  the  user  was  bouncing  up  and 
down,  the  head  tracker  maintained  a  good  track  on  the  intended  pitch  and  yaw  of 
the  user’s  head.  Because  we  did  not  have  access  to  the  TrackIR  API,  we  took  the 
pitch  and  yaw  output  from  the  tracker  and  ran  it  through  the  TIR2Joy  free  software 
package [52], which in turn relies on the joystick emulator program PPJoy [53].
Using this setup, the head tracker becomes visible to the system as a joystick input
device.

Investigation of this head-tracking view-control technique has
the  potential  to  improve  the  usability  of  such  a  system:  the  operator  has  both 
hands  available  for  other  purposes,  such  as  interacting  with  the  annotations  on  the 
screen,  or  talking  on  the  radio.  Also,  we  hypothesize  that  utilizing  head-tracking 
instead  of  more  conventional  methods  has  the  potential  to  increase  the  total  ease 
of  integration  of  the  camera/annotation  video  picture  into  the  user’s  mental 
situational  model. 
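
The amplified, pitch-and-yaw-only view control described above for the head tracker can be sketched as a gain applied to the head angles, with the result clamped to the range of the panorama. The gain and limit values and the function name are illustrative assumptions; they are not TrackIR settings, and no TrackIR, TIR2Joy or PPJoy interface is used here.

```python
def head_to_view(head_yaw_deg, head_pitch_deg,
                 yaw_gain=6.0, pitch_gain=3.0,
                 max_view_yaw=180.0, max_view_pitch=90.0):
    """Map small head rotations to amplified view rotations.

    Translation and roll are deliberately not inputs: only yaw and pitch
    drive the view, as in the setup described in the text.
    """
    def clamp(value, limit):
        return max(-limit, min(limit, value))

    view_yaw = clamp(head_yaw_deg * yaw_gain, max_view_yaw)
    view_pitch = clamp(head_pitch_deg * pitch_gain, max_view_pitch)
    return view_yaw, view_pitch


# A 10-degree head turn with a slight upward tilt
view = head_to_view(10.0, 5.0)
# A large head turn saturates at directly behind the user
behind = head_to_view(-45.0, 0.0)
```

With these assumed gains a 30-degree head turn already reaches the rear of the panorama, so the user can keep the monitor within his field of view while looking all the way around the vehicle.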


(2)  Object Selection, Editing and Manipulation. For basic
command input, we utilized simple mouse interaction events, in conjunction with a
touchscreen integrated into an LCD display. The following UML sequence
diagrams sketch the sequence of messages exchanged
between the different components. Our core software package, Vizard, provides the
mouse interaction functionality out of the box.

Figure 38  UML2 sequence diagram: Add icon

d.  Sense Environmental Geometry

This function provides the ability to sense, in real time, the 3D position in space of
the objects in the surrounding field of view around a particular point in space,
for the purpose of identifying the position of a selected object.
It also provides the ability to determine 3D locations of a large number of points in
a panoramic arc around a point in space, and then make this information available for
queries on the location of particular points.
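
A query on the location of a particular point can be sketched as follows: find the range return whose direction is closest to the selected direction, then convert it to Cartesian coordinates. This is a minimal sketch under stated assumptions: returns are given as (azimuth, elevation, range) tuples, the sensor frame has x forward, y left and z up, and the linear search stands in for whatever spatial index a real system would use. The function names are hypothetical.

```python
import math


def polar_to_cartesian(azimuth_deg, elevation_deg, range_m):
    """Convert a return to (x, y, z) in the assumed sensor frame.

    Azimuth is measured counter-clockwise from the x axis; elevation is
    measured up from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = range_m * math.cos(el)
    return (horizontal * math.cos(az),
            horizontal * math.sin(az),
            range_m * math.sin(el))


def nearest_return(returns, azimuth_deg, elevation_deg):
    """Pick the return whose direction is closest to the queried direction."""
    def angular_gap(ret):
        # Wrap the azimuth difference into [-180, 180) so 359 and 1 are close.
        d_az = (ret[0] - azimuth_deg + 180.0) % 360.0 - 180.0
        d_el = ret[1] - elevation_deg
        return math.hypot(d_az, d_el)
    return min(returns, key=angular_gap)


returns = [(0.0, 0.0, 10.0), (90.0, 0.0, 20.0), (359.0, -2.0, 15.0)]
hit = nearest_return(returns, azimuth_deg=0.5, elevation_deg=-1.8)
point = polar_to_cartesian(*hit)
```

The resulting `point` is expressed relative to the sensor; placing an annotation in world coordinates would additionally require the pose of the sensor.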

Our system incorporates a Velodyne® HDL-64E high definition 360°
LiDAR scanner [54]. This device consists of 64 laser rangefinders arrayed in a
26.8° vertical fan, mounted in a rotating head. As the head spins at 10 Hz, each
laser fires 2200 times per rotation, receiving the beam pulse back as a laser return.
This pulse is timed, and a distance is calculated from the time of flight. For each
rotation, 140,800 points are collected and streamed via UDP packets over an
Ethernet cable.
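
The figures above can be checked with a short calculation. The sketch below derives the point counts from the stated laser count, firing rate and rotation rate, and shows the generic time-of-flight range formula (half the round-trip time multiplied by the speed of light). The constant and function names are our own, and the formula is the textbook relation rather than the scanner's internal processing.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

LASERS = 64
FIRINGS_PER_ROTATION = 2200
ROTATION_RATE_HZ = 10


def range_from_time_of_flight(round_trip_s: float) -> float:
    """Distance to the target: the pulse travels out and back, hence the /2."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


points_per_rotation = LASERS * FIRINGS_PER_ROTATION          # 140,800
points_per_second = points_per_rotation * ROTATION_RATE_HZ   # 1,408,000
azimuth_step_deg = 360.0 / FIRINGS_PER_ROTATION              # about 0.16 degrees

# A return arriving one microsecond after firing is roughly 150 m away.
example_range_m = range_from_time_of_flight(1e-6)
```

At roughly 1.4 million points per second, the data rate is one reason a real-time query structure matters for the live depth field.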

The  purpose  of  this  sensor  is  also  two-fold,  and  in  parallel  to  the 
images  from  the  Ladybug  camera:  first,  to  provide  a  live  depth  field  for  the  AR 
system,  in  order  to  determine  the  depth  of  objects  in  the  Ladybug  image;  and 
second,  to  scan  in  order  to  provide  3D  points  from  which  to  construct  urban  model 
geometry.

Figure 39  Raw LiDAR returns from Velodyne

Figure 40  Velodyne LiDAR, including PARPICE-V and OpenSceneGraph-rendered
LiDAR points

The LiDAR and Ladybug are simultaneously operational in the
PARPICE-V: multiple test runs were conducted on the NPS campus to record
synchronized LiDAR scans and spherical video captures, for lab development
purposes.


e.  Generate Annotations

This  function  creates  the  ability  to  display  realistic  or  semi-realistic 
spatially  registered  computer  generated  imagery  on  top  of  a  live  view  of  the  real 
world  (either  optical  or  video  see-through). 


Figure  41  Screenshot  of  the  Vizard  development  environment 

Software Core: The core of an AR system is its software engine.
For the sake of timeliness, we chose a commercial software package for this
purpose: the Vizard suite from WorldViz®, Inc. [55] supplies most of the necessary
functionality for our prototype system (see Figure 41 and Figure 42). It provides
the following:


(1)  Scene Graph. Vizard provides 3D graphics scene
rendering abilities by incorporating OpenSceneGraph. This open source scene
graph  provides  the  necessary  data  structure  and  manipulation  capability  to 
organize,  group,  add  and  remove  renderable  objects  in  virtual  3D  space. 


Figure  42  Screenshot  of  Vizard  Code  being  edited  in  Eclipse/PyDev 


(2)  Peripheral Connectivity. Input and output are provided by
Vizard’s incorporation of functionality to connect to various input and output
devices, including joysticks, motion trackers, eye trackers, head-mounted displays,
cameras, and various other devices. This allows Vizard to be the central system
for integrating the components of our test system together.

(3)  Plug-ins. Vizard comes with a software development kit
(SDK) that allows developers to create “plug-ins” for various purposes, including
specialized rendering functions, and specialized hardware integration. It was our
intention eventually to connect the spherical camera and the LiDAR
fully with Vizard, although this has not been possible due to time and resource
constraints. While we think this is not the optimal solution, the Vizard environment
has some benefits in allowing rapid prototyping.

Figure  43  shows  a  screenshot  of  our  Python  code  running  in 
Vizard.  The  top  portion  of  the  screen  is  a  live  360°  panoramic  view  from  the 
Ladybug.  The  bottom  two-thirds  of  the  screen  is  the  field  of  view  that  the  user  is 
currently  viewing.  The  green  rectangle  in  the  panoramic  view  corresponds  to  the 
borders  of  the  main  view.  This  screenshot  shows  how  video  from  the  Ladybug 
appears  when  textured  onto  an  OpenSceneGraph  sphere:  in  Vizard,  we  set  the 
sphere  to  be  drawn  first,  with  everything  else  drawn  over  it,  regardless  of  position. 
This  effectively  sets  the  live  video  at  infinite  distance  from  the  camera,  to  prevent 
obscuration  of  other  objects. 
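
The relationship between the main view and the green rectangle in the panoramic strip can be approximated as follows. This sketch assumes the stitched panorama is an equirectangular image spanning 360° horizontally and 180° vertically, which is consistent with, but not confirmed by, the 2:1 stitched resolution in Table 5. The function names and the field-of-view values are hypothetical, and no Vizard or Ladybug API calls are used.

```python
def direction_to_pixel(yaw_deg, pitch_deg, width=5400, height=2700):
    """Map a view direction to equirectangular pixel coordinates.

    Assumes yaw 0 at the image center increasing to the right, and
    pitch +90 at the top row.
    """
    u = ((yaw_deg + 180.0) % 360.0) / 360.0 * width
    v = (90.0 - pitch_deg) / 180.0 * height
    return u, v


def view_rectangle(yaw_deg, pitch_deg, h_fov_deg, v_fov_deg,
                   width=5400, height=2700):
    """Approximate outline (left, top, right, bottom) of the main view.

    Projection distortion is ignored; if left > right, the rectangle
    wraps around the seam of the panorama.
    """
    left, top = direction_to_pixel(
        yaw_deg - h_fov_deg / 2.0, pitch_deg + v_fov_deg / 2.0, width, height)
    right, bottom = direction_to_pixel(
        yaw_deg + h_fov_deg / 2.0, pitch_deg - v_fov_deg / 2.0, width, height)
    return left, top, right, bottom


center = direction_to_pixel(0.0, 0.0)
rect = view_rectangle(0.0, 0.0, h_fov_deg=60.0, v_fov_deg=40.0)
```

Feeding the view yaw and pitch from the head tracker into `view_rectangle` would move the outline across the panorama as the user looks around.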




Figure  43  Screenshot  of  the  PARPICE  test  package  running  in  Vizard 

f.  Generate & Display View

The purpose of this function is to provide the ability to display
realistic or semi-realistic spatially registered computer generated imagery on top of
a live view of the real world (either direct or indirect). That is, the ability to display
to a human eye the aforementioned view, in a manner that retains the visual
features of that view; and the ability to display spatially registered information,
concurrently inserted into a live view of the environment with which that information
is registered.


Work Done and Current Status: Because our system is video see-through,
we need a monitor on which to display the images with annotations. In
this case, the best method is to use an ordinary flat-panel display, as opposed to
an HMD of some sort, or a HUD, for the following reasons:

A Video See-Through (VST) HMD has many drawbacks in a moving
armored vehicle. The first is that all electronics tend to break. Given our intent to
be unobtrusive to the user, this makes VST HMDs unsuitable: if a VST HMD
breaks, the user effectively is blindfolded until he can take the device off. In
combat, being blinded is not a desirable outcome. Also, because the VST HMD
blocks out the view of the real world, we expect users to be prone to motion
sickness.


An optical HMD is also less than suitable in this application, because
the optical view of the world is blocked at many angles by the sides of the vehicle.
Our immediate situational awareness problem would therefore fail to be
addressed.


A  HUD  is  also  not  suitable.  HUDs  have  been  used  in  combat  aircraft 
to  good  effect  because,  until  recently,  the  weapons  of  combat  aircraft  tended  to 
point  forward,  and  their  aim  points  could  be  displayed  on  the  HUD.  (This  recently 
has  changed  and  off-axis  capable  missiles  have  been  developed,  which  can  fire  to 
the  sides  of  an  aircraft.  Aircraft  with  these  weapons  are  equipped  with  HMDs  for 
the  pilots,  such  as  the  Joint  Helmet  Mounted  Cueing  System.)  A  crewmember  in  a 
HMMWV  does  not  enjoy  the  visibility  of  a  fighter  pilot. 

Figure 44 Touch-screen monitor mounted in PARPICE-V


For  these  reasons,  we  decided  to  use  a  flat  panel  monitor,  equipped 
with  a  touch  screen  in  order  to  interact  with  the  software  for  purposes  other  than 
view  control.  Issues  associated  with  this  display  method  include  screen  brightness 
and  contrast  limitations  in  an  outdoor  environment,  as  well  as  screen  glare. 

Figure 45 Conceptual view of system from user station


VI. FUTURE WORK


Under  the  assumption  that  investigation  in  this  project  will  be  ongoing,  we 
propose  future  targets  of  improvement  and  inquiry. 

A.  RESEARCH  QUESTIONS 

Separate  from  the  issue  of  improving  the  capabilities  of  the  system  is  the 
validation  of  these  capabilities  as  improvements  over  current  systems.  Augmented 
Reality  is  technologically  interesting  but,  at  this  time,  there  are  no  operational  systems 
to  test  and  compare  to  extant  methods.  Currently,  we  see  several  areas  of  research 
that  will  require  some  progress  to  conclusively  determine  any  benefit  to  the  use  of  AR. 

A key question is, “Can AR provide significantly enhanced performance over
other methods of situational awareness and tactical knowledge persistence?” Can
AR measurably enhance human performance in:

•  Accuracy  and  precision  of  position  determination 

•  Expansion  of  the  spatial  extent  of  situational  awareness  of 
surroundings 

•  Timeliness  and  accuracy  of  querying  and  recovering  information. 

Additionally,  there  are  questions  of  research  that  are  not  necessarily  AR- 
exclusive,  but  deal  with  the  overall  capabilities  of  a  system  such  as  we  describe. 
What  performance  enhancements  could  such  a  system  provide  in  the  areas  of: 

•  Operational  after  action  review:  could  the  system  provide  concrete 
performance  data  of  units  in  actual  combat  operations? 

•  Urban  modeling:  could  the  system  improve  speed  and  accuracy  of 
3D  urban  model  creation? 

•  Direct  and  indirect  fire  engagement:  could  the  system  improve
speed  and  accuracy  of  the  application  of  direct  fire,  and  similarly 
enhance  calls  for  fire  support? 

B.  SYSTEM  IMPROVEMENT 

Although  the  system  currently  is  not  in  a  fully  operational  state,  an  operational 
implementation  would  provide  a  platform  to  incorporate  other  research  efforts  and  add 
capabilities  to  the  system.  In  order  from  easiest  to  implement  to  most  difficult  (or  even 
speculative),  some  of  these  are: 

1.  Incorporate  PTZ/Slaved  Camera 

Integrating  a  fast  pan-tilt-zoom  camera  is  a  good  step  toward  making 
zooming  possible  in  the  panoramic  image. 

2.  Increase  Camera  Resolution 

The  newer  Ladybug  3  has  higher  resolution,  and  might  improve 
performance.  Also,  the  compressor  unit  is  integrated,  so  there  is  only  one  piece  of 
hardware. 

3.  RWS  Integration 

Integrating  the  system  with  a  Remote  Weapon  Station  would  help 
address  the  immediate  SA  problem.  This  integration  would  involve  using  the 
PARPICE  system  as  a  commander’s  viewer,  and  would  add  one-touch  slew-to- 
cue  for  the  RWS  to  slew  to  the  point  the  commander  indicated,  for  engagement  by 
the  gunner. 
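A minimal sketch of the slew-to-cue calculation follows. It assumes, purely for illustration, that the panoramic image is equirectangular with the vehicle's forward axis at the center column and that the RWS accepts a relative azimuth and an absolute elevation in degrees. The function names `pixel_to_az_el` and `slew_command` are hypothetical and do not correspond to any RWS or Ladybug interface.

```python
def pixel_to_az_el(x, y, width, height):
    """Map a panorama pixel to (azimuth, elevation) in degrees.

    Assumes an equirectangular image: column 0 is -180 deg, the center
    column is the vehicle's forward axis, and row 0 is +90 deg (straight up).
    """
    azimuth = (x / width) * 360.0 - 180.0
    elevation = 90.0 - (y / height) * 180.0
    return azimuth, elevation


def slew_command(touch_xy, image_size, turret_az):
    """Return (signed azimuth change, target elevation), both in degrees."""
    az, el = pixel_to_az_el(*touch_xy, *image_size)
    # Wrap into [-180, 180) so the turret turns the shorter way round.
    delta = (az - turret_az + 180.0) % 360.0 - 180.0
    return delta, el


# Commander touches a point on the horizon, 90 deg right of the vehicle axis,
# while the turret points straight ahead.
print(slew_command((3072, 1024), (4096, 2048), turret_az=0.0))  # (90.0, 0.0)
```

In a fielded system the touched point would more likely be resolved to a geo-located position using the LiDAR depth data, so that the cue remains valid while the vehicle moves; the angular mapping above is only the simplest case.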

4.  Multiple  Crew  Stations 

Investigation  into  networking  the  Ladybug  to  broadcast  (or  multicast) 
within  the  vehicle  on  gigabit  Ethernet  would  be  worthwhile.  When  combined  with 
the LiDAR broadcasting UDP, this would allow multiple computers to display
different fields of view to different crewmembers.
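Once every station receives the full panorama, each one only needs to select the image columns that fall within its own crewmember's view. The sketch below shows that selection under the assumption that the panorama columns span 0 to 360 degrees from left to right; `view_columns` is a hypothetical helper and the networking itself is not shown.

```python
def view_columns(heading_deg, fov_deg, width):
    """Return the panorama column ranges covering one crewmember's view.

    Assumes columns span 0..360 deg left to right, with column 0 at 0 deg.
    A view that straddles the image seam is returned as two ranges.
    """
    px_per_deg = width / 360.0
    start = int(((heading_deg - fov_deg / 2.0) % 360.0) * px_per_deg)
    end = start + int(fov_deg * px_per_deg)
    if end <= width:
        return [(start, end)]
    return [(start, width), (0, end - width)]


# Two stations sharing one 3600-column panorama.
print(view_columns(90.0, 90.0, 3600))  # [(450, 1350)]
print(view_columns(0.0, 90.0, 3600))   # [(3150, 3600), (0, 450)]
```

Each range is a half-open interval of columns, so the two ranges of a seam-straddling view can be concatenated directly to form the displayed image.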

5.  Optical  Character  Recognition

Auto-labeling/annotation  can  be  implemented,  allowing  business  and 
street  signs  to  be  read,  located,  and  added  to  the  annotation  database.  An 
extension  of  this  would  be  to  integrate  translation  software,  to  allow  parsing  of  local 
language  signs. 

6.  Change  Detection 

If  we  can  implement  saving  video  and  terrain,  then  we  can  potentially 
implement  live  change  detection  for  the  user:  changes  in  the  terrain  can  be 
highlighted  for  further  investigation.  A  necessity  for  this  capability  is  to  filter  out 
automobile  traffic. 
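As a rough illustration of the comparison step only, the sketch below flags grid cells whose height differs between a saved terrain scan and a live one. The grid representation, the 0.5 m threshold and the name `changed_cells` are assumptions made for this example; filtering out automobile traffic, which the capability requires, is not attempted here.

```python
def changed_cells(saved, live, threshold=0.5):
    """Return (row, col) of grid cells whose height changed by > threshold.

    `saved` and `live` are equal-sized 2D lists of heights in meters.
    None marks a cell with no LiDAR return; such cells are skipped.
    """
    changes = []
    for r, (saved_row, live_row) in enumerate(zip(saved, live)):
        for c, (old, new) in enumerate(zip(saved_row, live_row)):
            if old is None or new is None:
                continue
            if abs(new - old) > threshold:
                changes.append((r, c))
    return changes


saved_scan = [[0.0, 0.0],
              [0.0, 1.0]]
live_scan = [[0.0, 0.8],    # a new 0.8 m object has appeared
             [None, 1.1]]   # one cell missing, one within tolerance
print(changed_cells(saved_scan, live_scan))  # [(0, 1)]
```

The flagged cells could then be highlighted in the AR view for further investigation by the crew.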

7.  Implement  LiDAR  Tracking

LiDAR-based  tracking  is  a  very  important  area.  Implementing  the 
real-time  scanning  of  terrain  would  enable  live  tracking  with  the  LiDAR,  as  well  as 
model-based  tracking  that  would  not  require  the  LiDAR  to  be  constantly  activated. 


VII. CONCLUSIONS


The  bottom  line  is  that  vehicle-mounted  AR  is  feasible.  This  research  has 
identified  several  ways  that  a  vehicle-mounted  augmented  reality  system  could 
address  perceived  gaps  in  vehicle  crew  capability. 

A.  KNOWLEDGE  PERSISTENCE 

We  have  identified  characteristics  of  methods  of  portraying  annotative  tactical 
data  that  can  be  implemented  using  an  AR  system.  With  our  system  as  an  interface 
with  the  world,  and  an  extensive  networked  data  system  to  compile  the  information 
collected,  knowledge  can  persist  spatially  in  the  place  it  originated. 

Table 6 Knowledge persistence performance comparison w/ PARPICE

Problem: Knowledge Persistence. Assessment scale: 1-5, 1=Best.

Solutions                               Terrain   Available    Update      Spatial-Contextual   GIG           Average
                                        View      On-The-Move  Frequency   Info Placement       Integration
Paper Maps w/ Overlays                  4         2            5           4                    5             4
Sand Table                              3         5            4           4                    5             4.2
Blue Force Tracking Systems             3         1            2           4                    3             2.6
Web-Based Tactical Information Assets   3         5            2           3                    1             2.8
Serious Games                           2         5            4           3                    3             3.4
PARPICE (Projected)                     2         1            2           1                    3             1.8

As can be seen from Table 6, in comparison with existing solutions, an
operational  PARPICE-type  system  can  be  expected  to  out-perform  current  methods 
of  addressing  the  knowledge-persistence  problem. 
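The Average column in Table 6 is consistent with an unweighted mean of the five criterion scores in each row. The short sketch below reproduces those averages from the table's scores and ranks the solutions (lower is better); the variable and function names are ours, and equal weighting of the criteria is an assumption inferred from the numbers.

```python
# Criterion scores from Table 6, in column order: Terrain View,
# Available On-The-Move, Update Frequency, Spatial-Contextual Info
# Placement, GIG Integration (1-5, 1 = best).
scores = {
    "Paper Maps w/ Overlays": [4, 2, 5, 4, 5],
    "Sand Table": [3, 5, 4, 4, 5],
    "Blue Force Tracking Systems": [3, 1, 2, 4, 3],
    "Web-Based Tactical Information Assets": [3, 5, 2, 3, 1],
    "Serious Games": [2, 5, 4, 3, 3],
    "PARPICE (Projected)": [2, 1, 2, 1, 3],
}


def average(values):
    """Unweighted mean, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


# Rank from best (lowest average) to worst.
ranked = sorted(scores, key=lambda name: average(scores[name]))
for name in ranked:
    print(f"{average(scores[name]):.2f}  {name}")
```

A weighted mean could be substituted if some criteria are judged more important than others, which may change the ordering of the existing solutions.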

B.  CONSTRAINED-VIEW  SITUATIONAL  AWARENESS

Previous  efforts  at  developing  a  useful  vehicle-mounted  augmented  reality 
display  and  user  interface  system  have  not  resulted  in  an  operational  system  to  date. 
We  have  outlined  a  system  that  can  provide  a  panoramic  AR  display,  while  taking  an 
unobtrusive  add-on  approach  requiring  less  sophisticated  display  technology. 

Table 7 Constrained-View Situational Awareness performance comparison with PARPICE

Problem: Constrained-View Situational Awareness. Assessment scale: 1-5, 1=Best.

Solutions                 Crew         Vehicle Commander   Weapon System  Spatial-Contextual   Average
                          Protection   Visibility          Integration    Info Placement
Human Gunner-Observer     4            4                   3              4                    3.75
Remote Weapon Station     2            4                   1              5                    3
See-Through Turret        2            2                   4              4                    3

As in the previous discussion, Table 7 illustrates that, in comparison with
existing solutions, an operational PARPICE-type system can also be expected to
out-perform current methods of addressing the constrained-view situational
awareness problem.

In  all,  an  operational  system  incorporating  our  functional  components  has  the 
potential  to  provide  an  increase  in  situational  awareness;  quicker  and  more  accurate 
information  access  and  knowledge  persistence;  better  crew  survivability  and  greater 
avoidance  of  threats. 